- The paper introduces a novel inference-time debiasing method that selectively targets biased predictions using LEACE post-processing.
- It leverages a KL divergence-based criterion to quantify bias and dynamically balance accuracy and fairness at inference time.
- Experimental results demonstrate improved fairness metrics and competitive accuracy compared to resource-intensive debiasing methods.
Inference-Time Selective Debiasing
The paper "Inference-Time Selective Debiasing" presents a novel approach for improving both the prediction performance and fairness of ML models during the inference phase, without re-training the model. The authors introduce an inference-time mechanism termed "selective debiasing." The work is particularly relevant in scenarios where access to the complete training data is limited or re-training is infeasible.
Key Contributions
- Selective Debiasing: The proposed method targets only the subset of the model's predictions identified as potentially biased. Instead of discarding these predictions, as selective classification would, the method applies LEACE, a post-processing debiasing technique, to those specific instances.
- Bias Quantification Criterion: The authors propose scoring each instance by the Kullback-Leibler (KL) divergence between the model's standard and debiased predictive distributions, s(x) = KL(p(y|x) ‖ p_debiased(y|x)), where p_debiased comes from the LEACE-processed model. The highest-scoring predictions are treated as the most biased and are replaced with their debiased counterparts, with the aim of minimizing the fairness gap across groups.
Methodology
Selective Debiasing
The methodology builds on selective classification while extending it toward fairness. Traditional selective classification keeps high-confidence predictions and abstains on low-confidence ones. Selective debiasing likewise keeps trusted predictions, but rather than abstaining, it applies a debiasing algorithm to the predictions flagged as potentially biased; a sketch contrasting the two appears below.
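To make the contrast concrete, here is a minimal NumPy sketch (not the paper's code) over hypothetical (n, k) arrays of class probabilities, probs_orig and probs_debiased, and a boolean mask flagged marking potentially biased instances:

```python
import numpy as np

def selective_classification(probs, tau):
    """Classic selective classification: predict only when the max
    class probability exceeds tau; otherwise abstain (label -1)."""
    preds = probs.argmax(axis=-1)
    preds[probs.max(axis=-1) < tau] = -1
    return preds

def selective_debiasing(probs_orig, probs_debiased, flagged):
    """Selective debiasing: keep the original prediction, except for
    instances flagged as potentially biased, whose predictions come
    from the debiased (e.g., LEACE-processed) model instead."""
    probs = np.where(flagged[:, None], probs_debiased, probs_orig)
    return probs.argmax(axis=-1)
```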
Bias Quantification
The central innovation is the use of KL divergence for bias quantification: the divergence between the original model's predictive distribution and the distribution obtained after applying LEACE. The higher the divergence, the more biased an instance is deemed to be. Adjusting the selection threshold (equivalently, the fraction of instances that get debiased) lets practitioners trade accuracy against fairness; a sketch follows.
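A sketch of the KL-based selection under the same assumptions as above; the fraction-based cutoff is one illustrative way to set the threshold and may differ from the paper's exact procedure:

```python
import numpy as np

def kl_scores(p, q, eps=1e-12):
    """Per-instance KL(p || q) between original (p) and debiased (q)
    class-probability distributions; a higher score means debiasing
    changes the prediction more, so the instance is deemed more biased."""
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)

def flag_most_biased(p, q, fraction=0.3):
    """Flag the `fraction` of instances with the largest KL scores;
    raising the fraction favors fairness over accuracy, and vice versa."""
    scores = kl_scores(p, q)
    cutoff = np.quantile(scores, 1.0 - fraction)
    return scores >= cutoff
```

The resulting boolean mask feeds directly into the selective_debiasing sketch above.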
Post-Processing Techniques
The paper evaluates this selective debiasing mechanism using two prominent post-processing debiasing techniques:
- Iterative Null-space Projection (INLP): Iteratively trains linear probes to predict the protected attribute from the representations and projects the representations onto the null space of the probe weights, removing linearly decodable bias while leaving other information largely intact.
- LEAst-squares Concept Erasure (LEACE): Applies a closed-form affine transformation that provably prevents any linear classifier from recovering a target concept (e.g., gender or race) from the model's representations, while perturbing them as little as possible (see the usage sketch below).
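The LEACE authors publish a concept-erasure package; the following minimal sketch assumes its documented LeaceEraser interface, with random tensors standing in for real representations and protected-attribute labels:

```python
import torch
from concept_erasure import LeaceEraser  # pip install concept-erasure

# Stand-ins: 768-d sentence representations, binary protected attribute
X = torch.randn(2000, 768)
Z = torch.randint(0, 2, (2000,))

# Fit the least-squares-optimal affine eraser and apply it
eraser = LeaceEraser.fit(X, Z)
X_erased = eraser(X)
# Downstream predictions are then produced from X_erased; no linear
# probe should be able to recover Z from the erased representations.
```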
Experimental Results
The experiments are conducted on the text classification datasets MOJI and BIOS-2. Effectiveness is evaluated with accuracy, equal opportunity fairness, Distance To the Optimal point (DTO), and the Fairness F-score (FF); a DTO sketch appears after the results list. The results indicate that:
- Selective debiasing based on KL divergence significantly improves the joint fairness and performance metrics compared to standard inference-time debiasing methods.
- The proposed KL-based scoring criterion outperforms traditional Uncertainty Quantification (UQ) techniques on both balanced and imbalanced datasets.
- Selective debiasing with LEACE achieves competitive or superior results relative to more resource-intensive training-time and pre-processing debiasing methods.
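For context, DTO is commonly computed as the Euclidean distance from a model's (performance, fairness) point to the ideal (1, 1), lower being better; here is a sketch under that assumption (the paper's exact scaling may differ):

```python
import math

def dto(performance, fairness):
    """Distance To the Optimal point (1.0, 1.0); lower is better.
    Assumes both metrics are already on a [0, 1] scale."""
    return math.hypot(1.0 - performance, 1.0 - fairness)

# Example: dto(0.72, 0.95) ≈ 0.284 summarizes the two metrics jointly.
```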
Implications and Future Work
Practical Implications
The primary practical implication of this research is a flexible, computationally efficient mechanism for improving the fairness of a model's predictions without retraining. This is crucial for applications where data availability or computational resources are limited.
Theoretical Implications
Theoretically, the work sets a precedent for integrating selective prediction and debiasing into a unified framework. By leveraging KL divergence for bias quantification, the authors provide a robust and theoretically sound mechanism to systematically address instance-level biases.
Future Developments
Future research can explore:
- Expanding this approach to other fairness criteria beyond group fairness.
- Exploring additional bias quantification methods and their integration with other debiasing techniques.
- Implementing this approach in real-time ML systems to evaluate its efficacy and scalability in dynamic environments.
In conclusion, the paper "Inference-Time Selective Debiasing" introduces a pragmatic approach to enhancing model fairness during inference, effectively bridging a gap in the current landscape of ML fairness methodologies. The empirical results validate the potential of selective debiasing as a versatile tool for constructing fairer predictive models in resource-constrained settings.