Variable Importance in Generalized Linear Models -- A Unifying View Using Shapley Values
Abstract: Variable importance in regression analyses is of considerable interest in a variety of fields. There is no unique method for assessing variable importance. However, a substantial share of the available literature employs Shapley values, either explicitly or implicitly, to decompose a suitable goodness-of-fit measure, in the linear regression model typically the classical $R2$. Beyond linear regression, there is no generally accepted goodness-of-fit measure, only a variety of pseudo-$R2$s. We formulate and discuss the desirable properties of goodness-of-fit measures that enable Shapley values to be interpreted in terms of relative, and even absolute, importance. We suggest to use a pseudo-$R2$ based on the Kullback-Leibler divergence, the Kullback-Leibler $R2$, which has a convenient form for generalized linear models and permits to unify and extend previous work on variable importance for linear and nonlinear models. Several examples are presented, using data from public health and insurance.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.