
Null-sampling for Interpretable and Fair Representations

Published 12 Aug 2020 in cs.LG and stat.ML (arXiv:2008.05248v1)

Abstract: We propose to learn invariant representations, in the data domain, to achieve interpretability in algorithmic fairness. Invariance implies a selectivity for high level, relevant correlations w.r.t. class label annotations, and a robustness to irrelevant correlations with protected characteristics such as race or gender. We introduce a non-trivial setup in which the training set exhibits a strong bias such that class label annotations are irrelevant and spurious correlations cannot be distinguished. To address this problem, we introduce an adversarially trained model with a null-sampling procedure to produce invariant representations in the data domain. To enable disentanglement, a partially-labelled representative set is used. By placing the representations into the data domain, the changes made by the model are easily examinable by human auditors. We show the effectiveness of our method on both image and tabular datasets: Coloured MNIST, the CelebA and the Adult dataset.

Citations (27)

Summary

  • The paper introduces a novel adversarial null-sampling method that decouples class-relevant features from protected biases.
  • It employs invariant representations to improve both the interpretability and fairness of classification models.
  • Experimental results on benchmark datasets demonstrate reduced bias and sustained prediction accuracy across image and tabular data.

Null-Sampling for Interpretable and Fair Representations

The paper "Null-sampling for Interpretable and Fair Representations" by Thomas Kehrenberg, Myles Bartlett, Oliver Thomas, and Novi Quadrianto addresses pertinent issues in machine learning systems concerning algorithmic fairness and interpretability. The authors propose a method for learning invariant representations that facilitate interpretability while promoting fairness, particularly in tasks involving classification where data may present biased or spurious correlations.

Background and Methodology

The study targets bias in training datasets, where class labels are confounded by spurious correlations with protected characteristics such as race or gender; this setting is central to building machine learning models that are both fair and interpretable. The authors introduce a novel adversarial model equipped with a null-sampling procedure that produces invariant representations within the data domain. These invariant representations retain the high-level correlations relevant to the class labels while suppressing information tied to the protected characteristics.
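As a rough illustration of the null-sampling idea, the sketch below assumes an encoder whose latent code is split into a part z_s carrying the protected attribute and a remainder z_rest; null-sampling zeroes out z_s before decoding back into the data domain. The SplitAutoencoder class, its architecture, and the split sizes are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the null-sampling idea (not the paper's exact model):
# the latent code is split into z_s (protected attribute) and z_rest
# (everything else); null-sampling replaces z_s with zeros before decoding,
# so the output lives in the data domain but carries no protected information.
import torch
import torch.nn as nn

class SplitAutoencoder(nn.Module):
    def __init__(self, x_dim: int, zs_dim: int, zrest_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(x_dim, 128), nn.ReLU(),
                                     nn.Linear(128, zs_dim + zrest_dim))
        self.decoder = nn.Sequential(nn.Linear(zs_dim + zrest_dim, 128), nn.ReLU(),
                                     nn.Linear(128, x_dim))
        self.zs_dim = zs_dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.encoder(x)
        z_s, z_rest = z[:, :self.zs_dim], z[:, self.zs_dim:]
        # Null-sampling: zero out the protected-attribute part of the latent
        # code, then decode back into the data domain.
        z_null = torch.cat([torch.zeros_like(z_s), z_rest], dim=1)
        return self.decoder(z_null)
```

Because the decoded output is itself an image or tabular record, an auditor can compare the input with its null-sampled counterpart to see exactly what the model removed.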

To disentangle class-relevant features from the biased signal, the authors employ a partially-labelled representative set. Because the learned representations are placed in the data domain, the changes made by the model remain directly examinable by human auditors, which substantially improves the interpretability of the model's outputs and decisions.
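How the partially-labelled representative set might enter training can be pictured with the following hedged sketch, which reuses the SplitAutoencoder above: an adversary tries to recover the protected attribute from z_rest on the labelled subset, while the encoder is trained to reconstruct the input and fool the adversary. The training_step helper and its loss structure are illustrative, not the paper's exact objective.

```python
# Hedged sketch of adversarial disentanglement on the labelled subset.
import torch
import torch.nn as nn
import torch.nn.functional as F

def training_step(model, adversary, x, s, enc_opt, adv_opt):
    """One illustrative step.
    x: batch from the labelled representative set.
    s: float tensor of shape (batch, 1) with binary protected-attribute labels.
    enc_opt: optimizer over encoder + decoder parameters; adv_opt: over adversary.
    """
    z = model.encoder(x)
    z_s, z_rest = z[:, :model.zs_dim], z[:, model.zs_dim:]

    # 1) Update the adversary to predict s from z_rest (encoder detached).
    adv_loss = F.binary_cross_entropy_with_logits(adversary(z_rest.detach()), s)
    adv_opt.zero_grad()
    adv_loss.backward()
    adv_opt.step()

    # 2) Update encoder/decoder: reconstruct x while making z_rest
    #    uninformative about s (fooling the adversary via a negated loss).
    recon = model.decoder(torch.cat([z_s, z_rest], dim=1))
    recon_loss = F.mse_loss(recon, x)
    fool_loss = -F.binary_cross_entropy_with_logits(adversary(z_rest), s)
    enc_loss = recon_loss + fool_loss
    enc_opt.zero_grad()
    enc_loss.backward()
    enc_opt.step()
    return recon_loss.item(), adv_loss.item()
```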

Numerical Results and Dataset Analysis

The effectiveness of the proposed method is illustrated through experiments on a range of datasets: Coloured MNIST, CelebA, and the Adult dataset. These benchmarks validate the procedure and demonstrate that the model generalizes across both image and tabular data. The paper reports quantitative results showing that the method reduces bias while maintaining accurate class label predictions.
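To make the "strong bias" setup concrete, here is one hypothetical way a biased Coloured MNIST training set could be generated: the digit colour tracks the class label almost perfectly during training and is randomized at evaluation, so colour becomes a spurious shortcut. The colour_digits function, its palette, and the bias parameter are assumptions for illustration; the paper's exact colouring scheme may differ.

```python
# Illustrative construction of a biased Coloured MNIST-style training set.
import numpy as np

def colour_digits(images, labels, palette, bias=1.0, rng=None):
    """images: (N, 28, 28) greyscale in [0, 1]; labels: (N,) int digit labels;
    palette: (n_colours, 3) RGB values; bias: probability that colour follows
    the label (1.0 = fully biased training set, 0.0 = random colours)."""
    rng = rng or np.random.default_rng(0)
    n = len(images)
    # With probability `bias`, colour is determined by the label; otherwise random.
    follows_label = rng.random(n) < bias
    colour_idx = np.where(follows_label, labels % len(palette),
                          rng.integers(len(palette), size=n))
    coloured = images[:, :, :, None] * palette[colour_idx][:, None, None, :]
    return coloured  # (N, 28, 28, 3) colour images
```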

Implications and Future Directions

The implications of this research extend beyond immediate practical applications. By embedding fairness into the representation learning process, the authors contribute significantly to the growing field of ethical AI development. This methodology not only aids in producing more equitable AI models but also lays a foundation for future work in refining interpretability and fairness simultaneously within complex datasets.

Looking forward, further work is warranted in applying these invariant-representation techniques to larger, more diverse datasets and to domains beyond image and tabular data. The advances made here may inspire subsequent developments in AI aimed at balancing interpretability with fairness across various machine learning applications.

In conclusion, this paper presents a rigorously designed model addressing a core challenge in contemporary machine learning, offering insights into achieving fair and interpretable models through innovative use of invariant representation learning. The robust results and clear implications indicate a productive avenue for enhanced understanding and advancement in AI fairness and interpretability.
