Weighted boxes fusion: Ensembling boxes from different object detection models (1910.13302v3)

Published 29 Oct 2019 in cs.CV

Abstract: In this work, we present a novel method for combining predictions of object detection models: weighted boxes fusion. Our algorithm utilizes confidence scores of all proposed bounding boxes to construct the averaged boxes. We tested the method on several datasets and evaluated it in the context of the Open Images and COCO Object Detection tracks, achieving top results in these challenges. The source code is publicly available at https://github.com/ZFTurbo/Weighted-Boxes-Fusion

Citations (47)


Summary

The paper introduces Weighted Boxes Fusion, an algorithm that fuses bounding box predictions using confidence scores from multiple detection models.
In ensembling experiments it outperforms suppression-based techniques such as Non-Maximum Suppression, reaching a mAP of 56.1 on the COCO validation set.
It is best suited to accuracy-critical applications such as autonomous driving and medical imaging, where additional processing time is acceptable.

Weighted Boxes Fusion: A Novel Approach for Object Detection Ensembles

The paper "Weighted Boxes Fusion: Ensembling Boxes from Different Object Detection Models" aims to improve object detection accuracy by combining the outputs of multiple detection models. The authors, Solovyev, Wang, and Gabruseva, propose a way to fuse bounding box predictions from diverse models so that the ensemble benefits from the strengths of each.

Methodological Insights

The authors introduce Weighted Boxes Fusion (WBF), an algorithm designed to integrate bounding box predictions more effectively than traditional methods like Non-Maximum Suppression (NMS) and its variants. The core idea is to use the confidence scores of the predicted bounding boxes as weights, so that each fused box is an average to which every overlapping prediction contributes. NMS, by contrast, keeps only the highest-scoring box in a group and discards every other box whose IoU with it exceeds a threshold, which can throw away useful localization information from the remaining models.
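To make the contrast concrete, here is a minimal sketch for a single group of two overlapping boxes. The function names (`iou`, `nms_pick`, `weighted_average`), the `(x1, y1, x2, y2)` box layout, and the sample coordinates are assumptions made for illustration; they are not taken from the paper or its repository.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_pick(boxes, scores):
    """NMS-style outcome for one overlapping group: keep only the top-scoring box."""
    best = max(range(len(boxes)), key=lambda i: scores[i])
    return boxes[best], scores[best]


def weighted_average(boxes, scores):
    """Confidence-weighted average of the coordinates; score is the mean confidence."""
    total = sum(scores)
    fused = tuple(
        sum(s * b[k] for b, s in zip(boxes, scores)) / total for k in range(4)
    )
    return fused, total / len(scores)


# Two hypothetical predictions of the same object from two different models.
boxes = [(10.0, 10.0, 50.0, 50.0), (14.0, 12.0, 54.0, 52.0)]
scores = [0.9, 0.6]

print("IoU:", iou(boxes[0], boxes[1]))
print("NMS keeps:", nms_pick(boxes, scores))
print("Weighted average:", weighted_average(boxes, scores))
```

The NMS-style result is simply the first box, while the weighted average lies between the two boxes and closer to the more confident one.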

WBF's procedure involves several key steps:

Aggregation and Sorting: Predictions from all models are pooled into a single list and sorted by confidence score, highest first.
Clustering and Matching: Each box is compared with the fused boxes found so far. If its IoU with one of them exceeds a threshold, it joins that cluster; otherwise it starts a new cluster.
Fusion Calculations: Whenever a cluster changes, its fused box is recomputed as the average of the member coordinates weighted by confidence, so higher-confidence boxes pull the result toward themselves. The fused confidence is the mean of the member confidences.
Confidence Rescaling: Final confidence scores are scaled down for clusters to which fewer boxes than models contributed, since agreement among only a few models is weaker evidence for the detection.
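The steps above can be sketched in plain Python as follows. This is a simplified single-class illustration under stated assumptions, not the authors' implementation: the function names, the `(x1, y1, x2, y2)` box layout, the default `iou_thr=0.55`, and the `min(T, N) / N` rescaling (with `T` boxes in a cluster and `N` models) are choices made for this sketch, and per-model weights are left out.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def fuse(members):
    """Fuse (score, box) pairs: confidence-weighted coordinates, mean confidence."""
    total = sum(s for s, _ in members)
    box = tuple(sum(s * b[k] for s, b in members) / total for k in range(4))
    return box, total / len(members)


def wbf_sketch(model_boxes, model_scores, iou_thr=0.55):
    """model_boxes[m] holds the boxes of model m, model_scores[m] their confidences."""
    n_models = len(model_boxes)

    # Step 1: pool every prediction and sort by confidence, highest first.
    pooled = [(s, b)
              for boxes, scores in zip(model_boxes, model_scores)
              for b, s in zip(boxes, scores)]
    pooled.sort(key=lambda p: p[0], reverse=True)

    clusters = []  # clusters[i]: list of (score, box) members
    fused = []     # fused[i]: current (box, score) for clusters[i]
    for score, box in pooled:
        # Step 2: look for the best-matching fused box above the IoU threshold.
        best_i, best_iou = -1, iou_thr
        for i, (fused_box, _) in enumerate(fused):
            overlap = iou(box, fused_box)
            if overlap > best_iou:
                best_i, best_iou = i, overlap
        if best_i < 0:
            # No match: this box starts a new cluster.
            clusters.append([(score, box)])
            fused.append((box, score))
        else:
            # Step 3: add to the cluster and recompute its fused box.
            clusters[best_i].append((score, box))
            fused[best_i] = fuse(clusters[best_i])

    # Step 4: rescale confidence by how many boxes supported each cluster.
    results = [(box, score * min(len(members), n_models) / n_models)
               for (box, score), members in zip(fused, clusters)]
    results.sort(key=lambda r: r[1], reverse=True)
    return results


# Hypothetical predictions: model A sees two objects, model B only the first.
boxes_a = [(10.0, 10.0, 50.0, 50.0), (100.0, 100.0, 140.0, 140.0)]
boxes_b = [(14.0, 12.0, 54.0, 52.0)]
results = wbf_sketch([boxes_a, boxes_b], [[0.9, 0.8], [0.6]])
for box, score in results:
    print(box, score)
```

In this example the object seen by both models keeps its averaged confidence, while the object seen by only one model has its confidence halved, which is how the rescaling step penalizes detections that lack agreement.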

Experimental Evaluation

The proposed method was validated on major datasets, including Open Images and MS COCO, showing notable improvements. Specifically, in ensemble tasks, WBF significantly outperformed NMS, soft-NMS, and Non-Maximum Weighted (NMW) methods. On the COCO validation set, WBF reached a mAP of 56.1, placing it among the leading results on the COCO leaderboard.

Implications and Future Directions

Practical Implications: WBF is particularly valuable in scenarios where computational latency is less of a concern than accuracy. Examples include autonomous driving and medical imaging, where prediction accuracy can directly impact safety and diagnostic precision.

Theoretical Implications: The introduction of WBF challenges conventional suppression methodologies by suggesting that retaining and integrating overlapping predictions may enhance model outputs, particularly in varied or ambiguous detection environments.

Future Research: WBF has already been applied to 3D object detection in the Waymo and Lyft challenges, which suggests room for further adaptation beyond 2D boxes. Further research might also explore reducing its computational cost, as WBF's processing time exceeds that of traditional suppression techniques.

Conclusion

The Weighted Boxes Fusion technique is a significant contribution to ensemble-based object detection. By using confidence scores to average overlapping predictions rather than discarding them, WBF improves detection accuracy and offers a compelling alternative to conventional suppression approaches. Its results on large-scale datasets and challenges indicate that it is useful across a range of object detection tasks, and it may become a standard building block for model fusion strategies.


GitHub

GitHub - ZFTurbo/Weighted-Boxes-Fusion: Set of methods to ensemble boxes from different object detection models, including implementation of "Weighted boxes fusion (WBF)" method. (1,778 stars)