NMS Threshold matters for Ego4D Moment Queries -- 2nd place solution to the Ego4D Moment Queries Challenge 2023 (2307.02025v1)
Published 5 Jul 2023 in cs.CV
Abstract: This report describes our submission to the Ego4D Moment Queries Challenge 2023. Our submission extends ActionFormer, a recent method for temporal action localization. Our extension combines an improved ground-truth assignment strategy during training and a refined version of SoftNMS at inference time. Our solution is ranked 2nd on the public leaderboard with 26.62% average mAP and 45.69% Recall@1x at tIoU=0.5 on the test set, significantly outperforming the strong baseline from the 2023 challenge. Our code is available at https://github.com/happyharrycn/actionformer_release.
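For readers unfamiliar with the inference-time step mentioned in the abstract, the following is a minimal sketch of the standard Gaussian Soft-NMS applied to 1D temporal segments. It is not the refined variant used in the submission, whose details the abstract does not give. The function names (`temporal_iou`, `soft_nms_1d`) and the parameter values (`sigma`, `min_score`) are illustrative assumptions and are not taken from the released code.

```python
import math


def temporal_iou(a, b):
    """Temporal IoU between two segments given as (start, end)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms_1d(segments, scores, sigma=0.5, min_score=0.001):
    """Generic Gaussian Soft-NMS over 1D segments (illustrative sketch).

    Instead of discarding segments that overlap a selected one, their
    scores are decayed by exp(-tIoU^2 / sigma). Segments whose score
    falls below min_score are dropped. sigma and min_score are assumed
    values chosen for illustration only.
    Returns a list of (segment, score) in order of selection.
    """
    remaining = [(tuple(s), float(sc)) for s, sc in zip(segments, scores)]
    kept = []
    while remaining:
        # Select the highest-scoring remaining segment.
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_seg, best_score = remaining.pop(best_idx)
        kept.append((best_seg, best_score))

        # Decay the scores of the others according to their overlap.
        decayed = []
        for seg, sc in remaining:
            iou = temporal_iou(best_seg, seg)
            sc = sc * math.exp(-(iou * iou) / sigma)
            if sc >= min_score:
                decayed.append((seg, sc))
        remaining = decayed
    return kept


if __name__ == "__main__":
    segs = [(0.0, 10.0), (1.0, 11.0), (20.0, 30.0)]
    scs = [0.9, 0.8, 0.7]
    for seg, sc in soft_nms_1d(segs, scs):
        print(seg, round(sc, 3))
```

In this toy input, the second segment overlaps the first heavily, so its score is decayed and it ends up ranked below the non-overlapping third segment rather than being removed outright. The decay and cut-off settings control how aggressively overlapping predictions are suppressed, which is the kind of sensitivity the report's title points to.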