Group Activity Recognition using Unreliable Tracked Pose (2401.03262v1)
Abstract: Group activity recognition in video is a complex task because a model must recognise the actions of all individuals in the video as well as their complex interactions. Recent studies suggest that the best performance is achieved by tracking each person individually and then feeding the resulting sequence of poses or cropped images/optical flow into a model. This helps the model recognise what action each person is performing before the individual actions are merged to arrive at the group action class. However, all previous models rely heavily on high-quality tracking and have only been evaluated using ground truth tracking information. In practice, it is almost impossible to obtain highly reliable tracking information for all individuals in a group activity video. We introduce an innovative deep learning-based group activity recognition approach called Rendered Pose based Group Activity Recognition System (RePGARS), which is designed to be tolerant of unreliable tracking and pose information. Experimental results confirm that RePGARS outperforms all existing group activity recognition algorithms tested that do not use ground truth detection and tracking information.
[2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. 
[2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N., Shah, M.: Deep learning-based human pose estimation: A survey. ACM Computing Surveys 56(1), 1–37 (2023) Lu et al. [2019] Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. 
[2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. 
[2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. 
[2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). 
https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. 
IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). 
https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. 
In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. 
[2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. 
In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. 
[2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. 
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. 
[2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. 
[2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
3154–3160 (2017) Ma et al.
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
- Bagautdinov, T., Alahi, A., Fleuret, F., Fua, P., Savarese, S.: Social scene understanding: End-to-end multi-person action localization and collective activity recognition. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017) https://doi.org/10.1109/cvpr.2017.365 Perez et al. [2022] Perez, M., Liu, J., Kot, A.C.: Skeleton-based relational reasoning for group activity analysis. Pattern Recognition 122, 108360 (2022) Gavrilyuk et al. [2020] Gavrilyuk, K., Sanford, R., Javan, M., Snoek, C.G.: Actor-transformers for group activity recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 839–848 (2020) Thilakarathne et al. [2022] Thilakarathne, H., Nibali, A., He, Z., Morgan, S.: Pose is all you need: The pose only group activity recognition system (pogars). Machine Vision and Applications 33(6), 95 (2022) Zheng et al. [2023] Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N., Shah, M.: Deep learning-based human pose estimation: A survey. ACM Computing Surveys 56(1), 1–37 (2023) Lu et al. [2019] Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. 
In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. 
[2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Perez, M., Liu, J., Kot, A.C.: Skeleton-based relational reasoning for group activity analysis. Pattern Recognition 122, 108360 (2022) Gavrilyuk et al. [2020] Gavrilyuk, K., Sanford, R., Javan, M., Snoek, C.G.: Actor-transformers for group activity recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 839–848 (2020) Thilakarathne et al. [2022] Thilakarathne, H., Nibali, A., He, Z., Morgan, S.: Pose is all you need: The pose only group activity recognition system (pogars). Machine Vision and Applications 33(6), 95 (2022) Zheng et al. [2023] Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N., Shah, M.: Deep learning-based human pose estimation: A survey. ACM Computing Surveys 56(1), 1–37 (2023) Lu et al. [2019] Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. 
[2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. 
[2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. 
All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 
116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Gavrilyuk, K., Sanford, R., Javan, M., Snoek, C.G.: Actor-transformers for group activity recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 839–848 (2020) Thilakarathne et al. [2022] Thilakarathne, H., Nibali, A., He, Z., Morgan, S.: Pose is all you need: The pose only group activity recognition system (pogars). Machine Vision and Applications 33(6), 95 (2022) Zheng et al. 
[2023] Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N., Shah, M.: Deep learning-based human pose estimation: A survey. ACM Computing Surveys 56(1), 1–37 (2023) Lu et al. [2019] Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. 
[2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. 
[2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Thilakarathne, H., Nibali, A., He, Z., Morgan, S.: Pose is all you need: The pose only group activity recognition system (pogars). Machine Vision and Applications 33(6), 95 (2022) Zheng et al. [2023] Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N., Shah, M.: Deep learning-based human pose estimation: A survey. ACM Computing Surveys 56(1), 1–37 (2023) Lu et al. [2019] Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. 
[2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. 
[2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N., Shah, M.: Deep learning-based human pose estimation: A survey. ACM Computing Surveys 56(1), 1–37 (2023) Lu et al. [2019] Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? 
: Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. 
[2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. 
IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. 
IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). 
https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. 
In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). 
https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. 
In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. 
[2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. 
[2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 
2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 
4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. 
[2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. 
[2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. 
[2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. 
Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. 
IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. 
[2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Gavrilyuk, K., Sanford, R., Javan, M., Snoek, C.G.: Actor-transformers for group activity recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 839–848 (2020) Thilakarathne et al. [2022] Thilakarathne, H., Nibali, A., He, Z., Morgan, S.: Pose is all you need: The pose only group activity recognition system (pogars). Machine Vision and Applications 33(6), 95 (2022) Zheng et al. [2023] Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N., Shah, M.: Deep learning-based human pose estimation: A survey. ACM Computing Surveys 56(1), 1–37 (2023) Lu et al. [2019] Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. 
In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. 
[2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Thilakarathne, H., Nibali, A., He, Z., Morgan, S.: Pose is all you need: The pose only group activity recognition system (pogars). Machine Vision and Applications 33(6), 95 (2022) Zheng et al. [2023] Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N., Shah, M.: Deep learning-based human pose estimation: A survey. ACM Computing Surveys 56(1), 1–37 (2023) Lu et al. [2019] Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. 
IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. 
IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. 
[2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N., Shah, M.: Deep learning-based human pose estimation: A survey. ACM Computing Surveys 56(1), 1–37 (2023) Lu et al. [2019] Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. 
[2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. 
[2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. 
In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. 
IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. 
IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. 
[2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. 
In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. 
[2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. 
IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). 
https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. 
IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. 
[2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. 
IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). 
https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. 
[2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. 
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. 
[2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. 
[2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. 
IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. 
[2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. 
IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. 
[2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. 
[2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. 
[2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. 
All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 
116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. 
[2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. 
IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. 
CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. 
International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. 
In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. 
[2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. 
[2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. 
All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 
116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. 
[2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. 
In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. 
IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. 
International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. 
In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. 
[2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. 
[2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Thilakarathne, H., Nibali, A., He, Z., Morgan, S.: Pose is all you need: The pose only group activity recognition system (pogars). Machine Vision and Applications 33(6), 95 (2022) Zheng et al. [2023] Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N., Shah, M.: Deep learning-based human pose estimation: A survey. ACM Computing Surveys 56(1), 1–37 (2023) Lu et al. [2019] Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. 
In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. 
[2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. 
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N., Shah, M.: Deep learning-based human pose estimation: A survey. ACM Computing Surveys 56(1), 1–37 (2023) Lu et al. [2019] Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. 
In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. 
[2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. 
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? 
: Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. 
[2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. 
IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. 
In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). 
https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. 
In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. 
[2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. 
[2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 
2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. 
[2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
- Thilakarathne, H., Nibali, A., He, Z., Morgan, S.: Pose is all you need: The pose only group activity recognition system (pogars). Machine Vision and Applications 33(6), 95 (2022) Zheng et al. [2023] Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N., Shah, M.: Deep learning-based human pose estimation: A survey. ACM Computing Surveys 56(1), 1–37 (2023) Lu et al. [2019] Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. 
IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). 
https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N., Shah, M.: Deep learning-based human pose estimation: A survey. ACM Computing Surveys 56(1), 1–37 (2023) Lu et al. [2019] Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. 
IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). 
https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. 
International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. 
In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. 
[2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. 
In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. 
[2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. 
[2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. 
[2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. 
IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). 
https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. 
[2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). 
https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. 
[2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. 
[2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. 
[2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). 
https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. 
[2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). 
https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. 
[2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. 
[2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. 
[2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). 
https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. 
In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
- Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N., Shah, M.: Deep learning-based human pose estimation: A survey. ACM Computing Surveys 56(1), 1–37 (2023) Lu et al. [2019] Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. 
[2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. 
[2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019) Kreiss et al. [2021] Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. 
[2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). 
https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. 
IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). 
https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. 
[2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. 
[2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. 
[2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. 
[2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. 
International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. 
In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. 
[2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. 
In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. 
[2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. 
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. 
[2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. 
In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. 
In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. 
[2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). 
https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. 
In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
- Lu, L., Di, H., Lu, Y., Zhang, L., Wang, S.: Spatio-temporal attention mechanisms based model for collective activity recognition. Signal Processing: Image Communication 74, 162–174 (2019)
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. 
International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. 
In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. 
[2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. 
In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. 
[2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. 
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. 
[2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. 
In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. 
In: ECCV (2022) Karpathy et al.
[2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. 
[2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). 
https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. 
In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
- Kreiss, S., Bertoni, L., Alahi, A.: Openpifpaf: Composite fields for semantic keypoint detection and spatio-temporal association. IEEE Transactions on Intelligent Transportation Systems, 1–14 (2021) https://doi.org/10.1109/TITS.2021.3124981 Lan et al. [2012] Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. 
[2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. 
All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 
116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228 Amer et al. [2014] Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. 
[2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. 
[2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer Zhu et al. [2013] Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. 
International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. 
In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. 
[2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. 
In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322 Choi et al. [2009] Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. 
[2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. 
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Choi, W., Shahid, K., Savarese, S.: What are they doing? : Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009) Belongie et al. [2002] Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. 
[2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. 
In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. 
In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. 
[2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. 
[2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). 
https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. 
In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
- Lan, T., Wang, Y., Yang, W., Robinovitch, S.N., Mori, G.: Discriminative latent models for recognizing contextual group activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(8), 1549–1562 (2012) https://doi.org/10.1109/TPAMI.2011.228
In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. 
[2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. 
IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. 
In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. 
[2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. 
[2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 
2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 
4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. 
[2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. 
[2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. 
[2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. 
Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014)
- Amer, M.R., Lei, P., Todorovic, S.: Hirf: Hierarchical random field for collective activity recognition in videos. In: European Conference on Computer Vision, pp. 572–585 (2014). Springer
- Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322
- Choi, W., Shahid, K., Savarese, S.: What are they doing?: Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009)
- Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558
- Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021)
- Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. 
[2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). 
https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. 
[2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). 
https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. 
[2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. 
IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. 
[2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
- Zhu, Y., Nayak, N.M., Roy-Chowdhury, A.K.: Context-aware modeling and recognition of activities in video. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2491–2498 (2013). https://doi.org/10.1109/CVPR.2013.322
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. 
In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. 
[2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. 
In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. 
[2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. 
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. 
[2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. 
[2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. 
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. 
[2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. 
[2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. 
[2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. 
[2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). 
https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. 
In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
- Choi, W., Shahid, K., Savarese, S.: What are they doing?: Collective activity classification using spatio-temporal relationship among people. 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 1282–1289 (2009)
[2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. 
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. 
[2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). 
https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. 
[2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). 
https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. 
International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. 
[2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
- Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) https://doi.org/10.1109/34.993558 Wu et al. [2021] Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. 
[2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021) Ramanathan et al. [2016] Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. 
http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. 
IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ramanathan, V., Huang, J., Abu-El-Haija, S., Gorban, A., Murphy, K., Fei-Fei, L.: Detecting events and key actors in multi-person videos. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.332 Tsunoda et al. [2017] Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. 
[2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25 Qi et al. [2019] Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. 
[2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. 
[2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. 
[2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. 
[2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. 
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. 
In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. 
[2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. 
IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. 
Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
- Wu, L.-F., Wang, Q., Jian, M., Qiao, Y., Zhao, B.-X.: A comprehensive review of group activity recognition in videos. International Journal of Automation and Computing 18, 334–350 (2021)
Computer Vision and Image Understanding 134, 1–21 (2015) Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. 
Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. 
[2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. 
[2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. 
Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161 Azar et al. [2018] Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. 
[2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. 
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. 
[2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. 
[2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 
3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. 
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. 
[2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. 
[2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. 
[2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. 
[2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). 
https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. 
In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
- Tsunoda, T., Komori, Y., Matsugu, M., Harada, T.: Football action recognition using hierarchical lstm, pp. 155–163 (2017). https://doi.org/10.1109/CVPRW.2017.25
- Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019). https://doi.org/10.1109/TCSVT.2019.2894161
- Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018)
- Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017)
- Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014)
- Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012)
- Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543
- Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223
- Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022)
- Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association (1983)
- Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017)
- Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019). arXiv:1904.03308
- Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020)
- Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017)
- Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019)
- Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016)
- He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017). https://doi.org/10.1109/iccv.2017.322
- Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014)
- Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018)
- Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014). https://doi.org/10.1007/978-3-319-10602-1_48
- Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014)
- Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019)
- Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971
- Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016). https://doi.org/10.1007/978-3-319-46484-8_29
- He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016). https://doi.org/10.1109/cvpr.2016.90
- Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. 
Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. 
IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016)
- Qi, M., Wang, Y., Qin, J., Li, A., Luo, J., Van Gool, L.: stagnet: An attentive semantic rnn for group activity and individual action recognition. IEEE Transactions on Circuits and Systems for Video Technology PP, 1–1 (2019) https://doi.org/10.1109/TCSVT.2019.2894161
- Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018)
- Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017)
- Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014)
- Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012)
- Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543
- Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223
- Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022)
- Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983)
- Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017)
- Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308
- Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020)
- Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017)
- Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019)
- Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016)
- He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322
- Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014)
- Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018)
- Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48
- Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014)
- Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019)
- Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971
- Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29
- He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition.
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. 
In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. 
[2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. 
[2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. 
[2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. 
[2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. 
[2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. 
[2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). 
https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. 
- Azar, S.M., Atigh, M.G., Nickabadi, A.: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition (2018) Kay et al. [2017] Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. 
[2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The Kinetics Human Action Video Dataset (2017) Jiang et al. [2014] Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. 
[2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014) Soomro et al. [2012] Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. 
In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012) Kuehne et al. [2011] Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. 
In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. 
In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. 
[2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. 
All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 
116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. 
CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. 
International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. 
[2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). 
https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. 
[2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. 
[2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). 
https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. 
In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
- Jiang, Y.-G., Liu, J., Roshan Zamir, A., Toderici, G., Laptev, I., Shah, M., Sukthankar, R.: THUMOS Challenge: Action Recognition with a Large Number of Classes. http://crcv.ucf.edu/THUMOS14/ (2014)
- Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012)
- Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: HMDB: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543
- Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223
- Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022)
- Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association (1983)
- Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017)
- Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308
- Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and RGB data. Pattern Recognition Letters 131, 293–299 (2020)
- Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3D residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017)
- Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: OpenPose: Realtime multi-person 2D pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019)
- Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: DeepCut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016)
- He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322
- Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in Neural Information Processing Systems 27 (2014)
- Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: ShuffleNet V2: Practical guidelines for efficient CNN architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018)
- Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft COCO: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48
- Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014)
- Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
- Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971
- Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29
- He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90
- Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
[2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). 
https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. 
[2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. 
International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. 
In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. 
International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. 
[2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
- Soomro, K., Zamir, A.R., Shah, M.: UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild (2012)
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. 
Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. 
IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
- Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: A large video database for human motion recognition. In: 2011 International Conference on Computer Vision, pp. 2556–2563 (2011). https://doi.org/10.1109/ICCV.2011.6126543 Karpathy et al. [2014] Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. 
[2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). 
https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. 
In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. 
[2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. 
[2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. 
[2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
- Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725–1732 (2014). https://doi.org/10.1109/CVPR.2014.223 Wu et al. [2022] Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 
2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. 
Computer Vision and Image Understanding 134, 1–21 (2015) Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022) Association [1983] Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. [2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? 
Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983) Jin et al. 
[2017] Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. 
[2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017) Azar et al. [2019] Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. 
Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308 Franco et al. [2020] Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. 
[2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020) Hara et al. [2017] Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. 
In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017) Cao et al. [2019] Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. 
IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019) Pishchulin et al. [2016] Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. 
International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) He et al. [2017] He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. 
In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322 Yosinski et al. [2014] Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. 
[2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. 
[2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. 
[2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
- Wu, D., Zhao, H., Bao, X., Wildes, R.P.: Sports video analysis on large-scale data. In: ECCV (2022)
- Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983)
- Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017)
- Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308
- Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and RGB data. Pattern Recognition Letters 131, 293–299 (2020)
- Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3D residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017)
- Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: OpenPose: Realtime multi-person 2D pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019)
- Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: DeepCut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016)
- He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322
- Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in Neural Information Processing Systems 27 (2014)
- Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: ShuffleNet V2: Practical guidelines for efficient CNN architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018)
- Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft COCO: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48
- Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014)
- Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019)
- Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971
- Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29
- He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90
- Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
- Association, A.A.N.: Netball Official Rules / Authorised by All Australia Netball Association. All Australia Netball Association, ??? (1983)
- Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017)
- Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308
- Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020)
- Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017)
- Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019)
- Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016)
- He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322
- Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014)
- Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018)
- Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48
- Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014)
- Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019)
- Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971
- Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29
- He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90
- Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
- Jin, S., Ma, X., Han, Z., Wu, Y., Yang, W., Liu, W., Qian, C., Ouyang, W.: Towards multi-person pose tracking: Bottom-up and top-down methods. In: ICCV Posetrack Workshop, vol. 2, p. 7 (2017)
- Azar, S.M., Atigh, M.G., Nickabadi, A., Alahi, A.: Convolutional relational machine for group activity recognition. CoRR abs/1904.03308 (2019) arXiv:1904.03308
- Franco, A., Magnani, A., Maio, D.: A multimodal approach for human activity recognition based on skeleton and rgb data. Pattern Recognition Letters 131, 293–299 (2020)
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014) Ma et al. [2018] Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. 
Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018) Lin et al. [2014] Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. 
[2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48 Kingma and Ba [2014] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. 
[2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014) Paszke et al. [2019] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. 
Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
- Hara, K., Kataoka, H., Satoh, Y.: Learning spatio-temporal features with 3d residual networks for action recognition. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 3154–3160 (2017)
- Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A.: Openpose: Realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence (2019)
- Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016)
- He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask r-cnn. 2017 IEEE International Conference on Computer Vision (ICCV) (2017) https://doi.org/10.1109/iccv.2017.322
[2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019) Sendo and Ukita [2019] Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971 Newell et al. [2016] Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. 
Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29 He et al. [2016] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90 Fortun et al. [2015] Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015) Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)
- Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? Advances in neural information processing systems 27 (2014)
- Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: Practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018)
- Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: Common objects in context. Lecture Notes in Computer Science, 740–755 (2014) https://doi.org/10.1007/978-3-319-10602-1_48
- Kingma, D., Ba, J.: Adam: A method for stochastic optimization. International Conference on Learning Representations (2014)
- Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems 32 (2019)
- Sendo, K., Ukita, N.: Heatmapping of people involved in group activities. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6 (2019). https://doi.org/10.23919/MVA.2019.8757971
- Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. Lecture Notes in Computer Science, 483–499 (2016) https://doi.org/10.1007/978-3-319-46484-8_29
- He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016) https://doi.org/10.1109/cvpr.2016.90
- Fortun, D., Bouthemy, P., Kervrann, C.: Optical flow modeling and computation: A survey. Computer Vision and Image Understanding 134, 1–21 (2015)