Better Aggregation in Test-Time Augmentation (2011.11156v2)

Published 23 Nov 2020 in cs.CV

Abstract: Test-time augmentation -- the aggregation of predictions across transformed versions of a test input -- is a common practice in image classification. Traditionally, predictions are combined using a simple average. In this paper, we present 1) experimental analyses that shed light on cases in which the simple average is suboptimal and 2) a method to address these shortcomings. A key finding is that even when test-time augmentation produces a net improvement in accuracy, it can change many correct predictions into incorrect predictions. We delve into when and why test-time augmentation changes a prediction from being correct to incorrect and vice versa. Building on these insights, we present a learning-based method for aggregating test-time augmentations. Experiments across a diverse set of models, datasets, and augmentations show that our method delivers consistent improvements over existing approaches.
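To make the abstract's terms concrete, the following is a minimal sketch of test-time augmentation with a simple average and with a weighted average. It is not the authors' method: the abstract does not specify how their aggregation is parameterized or learned, so the fixed per-augmentation `weights` are an assumption standing in for whatever a learned aggregator would produce. The names `aggregate_tta`, `toy_predict`, `identity`, and `flip` are hypothetical, and the toy model is only there to keep the example runnable.

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def aggregate_tta(
    predict: Callable[[Vector], Vector],
    x: Vector,
    augmentations: Sequence[Callable[[Vector], Vector]],
    weights: Optional[Sequence[float]] = None,
) -> Vector:
    """Combine class-probability vectors predicted on each augmented copy of x.

    With weights=None this is the simple average described in the abstract.
    Passing weights gives a weighted average (an assumed, illustrative
    stand-in for a learned aggregation; not the paper's exact formulation).
    """
    preds = [predict(aug(x)) for aug in augmentations]
    if weights is None:
        weights = [1.0 / len(preds)] * len(preds)
    n_classes = len(preds[0])
    return [
        sum(w * p[c] for w, p in zip(weights, preds))
        for c in range(n_classes)
    ]


def argmax(v: Vector) -> int:
    """Index of the largest entry."""
    return max(range(len(v)), key=lambda i: v[i])


# Hypothetical toy setup: a two-element "image" and a two-class "model"
# whose output depends only on the first element of its input.
def toy_predict(x: Vector) -> Vector:
    return [x[0], 1.0 - x[0]]


def identity(x: Vector) -> Vector:
    return list(x)


def flip(x: Vector) -> Vector:
    return list(reversed(x))


x = [0.9, 0.2]
simple = aggregate_tta(toy_predict, x, [identity, flip])
weighted = aggregate_tta(toy_predict, x, [identity, flip], weights=[0.2, 0.8])

print(argmax(toy_predict(x)), argmax(simple), argmax(weighted))
```

In this toy case the unaugmented prediction and the simple average both pick class 0, while the weighted average picks class 1. That shows how the choice of aggregation alone can flip a prediction, which is the kind of effect the abstract reports analyzing.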

Authors (4)

Divya Shanmugam (16 papers)
Davis Blalock (10 papers)
Guha Balakrishnan (42 papers)
John Guttag (42 papers)

Citations (128)
