Component-level interpretability in HOBZ-BART

Develop component-level interpretability methods for the HOBZ-BART shared-forest model that attribute covariate effects and interactions separately to each of its three outcome components: Pr(Y=1) for persistent heavy drinking, Pr(Y=0 | Y<1) for complete abstinence, and the Beta-regression mean f_b(x) for Y in (0,1). The goal is to clarify how distinct covariate patterns influence abstinence, partial drinking, and heavy use.

Background

HOBZ-BART models bounded semicontinuous outcomes with point masses at both boundaries by decomposing the response into three components: the probability of heavy use (point mass at 1), the probability of abstinence (point mass at 0, conditional on the outcome not being 1), and the Beta-distributed response on the interior (0,1). A shared BART forest captures nonlinearities and interactions across all three components.
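The decomposition can be written out as a three-part mixture. The sketch below is one reading of the description above, not the authors' stated likelihood. The symbols p_1(x) = Pr(Y=1 | x) and p_0(x) = Pr(Y=0 | Y<1, x) are shorthand introduced here, and the precision parameter \phi assumes the common mean-precision form of Beta regression, which the source does not confirm.

```latex
\[
\begin{aligned}
\Pr(Y = 1 \mid x) &= p_1(x), \\
\Pr(Y = 0 \mid x) &= \{1 - p_1(x)\}\, p_0(x), \\
f(y \mid x) &= \{1 - p_1(x)\}\{1 - p_0(x)\}\,
  \mathrm{Beta}\!\big(y;\ f_b(x)\,\phi,\ \{1 - f_b(x)\}\,\phi\big),
  \qquad 0 < y < 1 .
\end{aligned}
\]
% Implied overall mean under this parameterization:
\[
\mathbb{E}[Y \mid x] = p_1(x) + \{1 - p_1(x)\}\{1 - p_0(x)\}\, f_b(x).
\]
```

Under this reading, a covariate can shift the overall mean through any of p_1, p_0, or f_b, which is why attributions to the overall mean alone cannot distinguish effects on abstinence, partial drinking, and heavy use.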

While this structure improves predictive performance and aligns with clinical reasoning, it remains unresolved which covariates drive each component and how interactions differ across abstinence, partial drinking, and heavy use. The authors explicitly note that interpretability at the component level is an open challenge, which motivates the development of component-specific explanations such as variable-importance measures or interaction-aware summaries.
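One simple, model-agnostic way to make "component-specific variable importance" concrete is a permutation-based prediction sensitivity computed separately for each component. The sketch below is illustrative only and is not the authors' method. The function `component_permutation_importance`, the component names, and the toy prediction functions are all hypothetical stand-ins; in practice each function would return posterior-mean predictions from a fitted model for Pr(Y=1), Pr(Y=0 | Y<1), or f_b(x).

```python
import numpy as np


def component_permutation_importance(predict_fns, X, n_repeats=10, seed=0):
    """Per-component sensitivity: mean absolute change in a component's
    prediction when a single covariate column is randomly permuted.

    predict_fns: dict mapping a component name to a function X -> predictions.
    Returns a dict mapping each component name to an array of length n_features.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    importance = {}
    for name, predict in predict_fns.items():
        baseline = predict(X)
        scores = np.zeros(n_features)
        for j in range(n_features):
            for _ in range(n_repeats):
                X_perm = X.copy()
                # Break the link between column j and the other covariates.
                X_perm[:, j] = rng.permutation(X[:, j])
                scores[j] += np.mean(np.abs(predict(X_perm) - baseline))
        importance[name] = scores / n_repeats
    return importance


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical stand-ins for the three fitted components (assumptions,
# not HOBZ-BART output): each depends on a different subset of covariates.
toy_components = {
    "heavy_use": lambda X: sigmoid(2.0 * X[:, 0]),          # Pr(Y=1)
    "abstinence": lambda X: sigmoid(-1.5 * X[:, 1]),        # Pr(Y=0 | Y<1)
    "interior_mean": lambda X: sigmoid(X[:, 0] * X[:, 2]),  # f_b(x), an interaction
}

X_demo = np.random.default_rng(1).normal(size=(500, 4))
imp = component_permutation_importance(toy_components, X_demo)
for name, scores in imp.items():
    print(name, np.round(scores, 3))
```

In this toy setup, covariate 0 matters for heavy use and for the interior mean but not for abstinence, which is the kind of contrast a component-level summary is meant to surface. Because the score measures prediction sensitivity rather than loss, it needs no observed outcomes, but it does not by itself separate main effects from interactions.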

References

Third, while HOBZ-BART effectively captures nonlinear interactions through the shared BART structure, interpretability at the component level remains an open challenge.