
Model Patching: Closing the Subgroup Performance Gap with Data Augmentation (2008.06775v1)

Published 15 Aug 2020 in cs.LG, cs.AI, cs.CV, and stat.ML

Abstract: Classifiers in machine learning are often brittle when deployed. Particularly concerning are models with inconsistent performance on specific subgroups of a class, e.g., exhibiting disparities in skin cancer classification in the presence or absence of a spurious bandage. To mitigate these performance differences, we introduce model patching, a two-stage framework for improving robustness that encourages the model to be invariant to subgroup differences, and focus on class information shared by subgroups. Model patching first models subgroup features within a class and learns semantic transformations between them, and then trains a classifier with data augmentations that deliberately manipulate subgroup features. We instantiate model patching with CAMEL, which (1) uses a CycleGAN to learn the intra-class, inter-subgroup augmentations, and (2) balances subgroup performance using a theoretically-motivated subgroup consistency regularizer, accompanied by a new robust objective. We demonstrate CAMEL's effectiveness on 3 benchmark datasets, with reductions in robust error of up to 33% relative to the best baseline. Lastly, CAMEL successfully patches a model that fails due to spurious features on a real-world skin cancer dataset.

Citations (110)

Summary

  • The paper introduces a two-stage framework that learns inter-subgroup transformations and uses data augmentation to close performance gaps.
  • It leverages a CycleGAN to enforce subgroup invariance, reducing reliance on spurious features and improving robust accuracy by 11.7% on a real-world skin cancer task.
  • Theoretical analysis bounds mutual information between subgroup features and outputs, with validations across multiple benchmark datasets.

Model Patching: Closing the Subgroup Performance Gap with Data Augmentation

The paper, "Model Patching: Closing the Subgroup Performance Gap with Data Augmentation," presents a machine learning methodology for addressing the performance disparities that classifiers exhibit across different subgroups within a class. The research outlines a deficiency of standard machine learning models: while optimizing for average performance, they often predict poorly on underrepresented subgroups, leading to biases driven by spurious correlations. This concern is especially important in applications like skin cancer detection, where classifiers may erroneously associate benign status with confounding features such as bandages visible in training images.

Model Patching Framework

The authors introduce a two-stage framework called model patching designed to enhance robustness by ensuring that classifiers are invariant to subgroup-specific variations. This approach involves:

  1. Learning Inter-Subgroup Transformations: This stage utilizes a CycleGAN to learn intra-class, inter-subgroup semantic transformations that aim to modify subgroup characteristics while preserving class label semantics.
  2. Training with Data Augmentations: Leveraging these learned transformations, the framework augments the training dataset so that the classifier learns to be invariant to subgroup variations, improving subgroup robustness.
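The two stages above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are ours, and the learned CycleGAN generator is replaced by a stand-in feature-space shift so the sketch stays self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 (assumed already trained): a CycleGAN generator G_{A->B} that maps
# an example from subgroup A to its counterpart in subgroup B while preserving
# the class label. Here it is a hypothetical stand-in, not a real GAN.
def translate_a_to_b(x):
    # Stand-in for the learned intra-class, inter-subgroup translation,
    # e.g. an "add bandage"-style shift in feature space.
    return x + 0.5

# Stage 2: couple each example with its translated counterpart; both copies
# keep the original class label, so the classifier sees the same class
# rendered with different subgroup features.
def augment_dataset(X, y):
    X_aug = translate_a_to_b(X)
    return np.concatenate([X, X_aug]), np.concatenate([y, y])

X = rng.normal(size=(4, 3))
y = np.array([0, 1, 0, 1])
X_all, y_all = augment_dataset(X, y)
```

Training then proceeds on the coupled dataset, which is what lets the consistency regularizer described next compare predictions on an example and its translation.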

The instantiation of this framework, termed CAMEL (CycleGAN Augmented Model Patching), incorporates subgroup consistency regularization to enforce invariance, accompanied by a new robust objective targeted at balancing subgroup performance.
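A consistency regularizer of this kind can be sketched as below. The exact form of CAMEL's regularizer is specified in the paper; this sketch uses a symmetric, Jensen-Shannon-style divergence to the mixture of the two predictive distributions, which is one common way to penalize disagreement between an example and its inter-subgroup translation. All function names are ours.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q, eps=1e-12):
    # KL(p || q) per example, with a small epsilon for stability.
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def subgroup_consistency(logits_orig, logits_aug):
    """Penalize divergence between the classifier's predictions on an
    example and on its inter-subgroup translation (Jensen-Shannon style)."""
    p, q = softmax(logits_orig), softmax(logits_aug)
    m = 0.5 * (p + q)
    return 0.5 * (kl(p, m) + kl(q, m)).mean()
```

Driving this term to zero forces the classifier to produce the same predictive distribution regardless of which subgroup's features an example carries, which is the invariance the framework targets.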

Theoretical Analysis

The paper offers a theoretical underpinning for this methodology by modeling a data generation process that separates subgroup-specific features from class-specific features. The analysis shows that the model patching framework bounds the mutual information between subgroup information and the classifier's output, thereby enforcing subgroup invariance. This invariance is enforced through:

  1. Introducing domain translation transformations that model changes in subgroup membership across coupled examples.
  2. Enforcing a subgroup-tailored consistency loss that promotes agreement in predictions across these augmented examples.
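The invariance goal behind this consistency loss can be stated informally as follows; the notation here is ours and the paper's precise bound differs in form. Let $Y$ denote the class label, $Z$ the subgroup membership, and $\hat{Y}$ the classifier's output:

```latex
I(\hat{Y};\, Z \mid Y) \;\le\; \epsilon
```

Intuitively, when the consistency loss is small, the predictions on a coupled pair $(x, F(x))$ (an example and its subgroup translation) nearly coincide, so the output $\hat{Y}$ can carry little information about $Z$ beyond what the class $Y$ already determines.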

Numerical Results

CAMEL demonstrates substantial reductions in subgroup performance gaps across benchmark datasets including MNIST-Correlation, CelebA-Undersampled, Waterbirds, and the ISIC skin cancer dataset. Empirically, CAMEL:

  • Achieves reductions in robust error of up to 33% relative to the best baseline on controlled setups such as MNIST-Correlation and CelebA.
  • Reduces the model's reliance on spurious features; on the ISIC skin cancer task, robust accuracy increases by 11.7%.
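The robust error reported above is the standard worst-group error: the maximum error rate over subgroups, rather than the average. A minimal computation of this metric (the function name is ours):

```python
import numpy as np

def worst_group_error(y_true, y_pred, groups):
    """Worst-group (robust) error: the highest per-subgroup error rate.
    This is the metric in which CAMEL's reductions are reported."""
    errors = []
    for g in np.unique(groups):
        mask = groups == g
        errors.append(np.mean(y_true[mask] != y_pred[mask]))
    return max(errors)

# Example: group 0 has one of two examples wrong (error 0.5),
# group 1 is perfect (error 0.0), so the robust error is 0.5
# even though the average error is only 0.25.
y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0, 1, 1, 1])
groups = np.array([0, 0, 1, 1])
```

Averaging would hide the weak subgroup; taking the maximum is what makes the metric sensitive to exactly the gaps model patching aims to close.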

Implications and Future Directions

The implications of model patching are significant for deploying machine learning models in real-world settings where subgroup performance parity is critical. By robustly addressing subgroup variability, model patching could help prevent discriminatory outcomes in sensitive applications such as medicine, finance, and the social sciences.

Future developments could explore:

  • Extending the model patching framework to other modalities such as text and audio that similarly require subgroup invariance.
  • Enhancing the learned transformations to capture more complex subgroup interactions.
  • Integrating with more sophisticated generative models, like StarGAN or Augmented CycleGAN, to further improve robustness across diverse applications.

In summary, the paper provides an insightful addition to the methodologies available for improving subgroup performance robustness in machine learning models, backed by strong theoretical modeling and empirical validation across various datasets and applications.