Bias, diversity, and challenges to fairness in classification and automated text analysis. From libraries to AI and back

Published 7 Mar 2023 in cs.CY and cs.AI | (2303.07207v1)

Abstract: Libraries increasingly rely on computational methods, including methods from AI. This growing use raises concerns about the risks of AI, which are now widely discussed in the scientific literature, the media, and law-making. In this article we investigate the risks of bias and unfairness when AI is used for classification and automated text analysis in library applications. We describe examples showing that the library community has long been aware of such risks and has developed and deployed countermeasures. We take a closer look at the notion of '(un)fairness' in relation to the notion of 'diversity', and we investigate a formalisation of diversity that models both inclusion and distribution. We argue that many of the unfairness problems of automated content analysis can also be viewed through the lens of diversity and of the countermeasures taken to enhance diversity.