Using Individualized Treatment Effects to Assess Treatment Effect Heterogeneity

Published 2 Feb 2025 in stat.AP and stat.ME | (2502.00713v1)

Abstract: Assessing treatment effect heterogeneity (TEH) in clinical trials is crucial, as it provides insights into the variability of treatment responses among patients, influencing important decisions related to drug development. Furthermore, it can lead to personalized medicine by tailoring treatments to individual patient characteristics. This paper introduces novel methodologies for assessing treatment effects using the individual treatment effect as a basis. To estimate this effect, we use a Double Robust (DR) learner to infer a pseudo-outcome that reflects the causal contrast. This pseudo-outcome is then used to perform three objectives: (1) a global test for heterogeneity, (2) ranking covariates based on their influence on effect modification, and (3) providing estimates of the individualized treatment effect. We compare our DR-learner with various alternatives and competing methods in a simulation study, and also use it to assess heterogeneity in a pooled analysis of five Phase III trials in psoriatic arthritis. By integrating these methods with the recently proposed WATCH workflow (Workflow to Assess Treatment Effect Heterogeneity in Drug Development for Clinical Trial Sponsors), we provide a robust framework for analyzing TEH, offering insights that enable more informed decision-making in this challenging area.


Authors (6)

Summary

The paper presents the DR-learner approach that robustly estimates individualized treatment effects using a doubly robust framework.
It employs permutation tests and conditional inference to effectively test for treatment effect heterogeneity and identify key effect modifiers.
The method is validated via simulations and real-world Phase III trial data, demonstrating improved CATE estimation over competing methods.

Introduction

The paper discusses methodologies for evaluating treatment effect heterogeneity (TEH) in clinical trials using individualized treatment effects (ITE) as the foundation. By focusing on ITE, the methodologies aim to provide insight into the variability of patient responses, which in turn supports personalized treatment strategies. The core method is the Double Robust (DR) learner, which estimates a pseudo-outcome reflecting the causal contrast. This pseudo-outcome serves three objectives: conducting a global test for heterogeneity, ranking covariates by their influence on effect modification, and estimating individualized treatment effects.

DR-Learner Implementation

The DR-learner takes a doubly robust approach, combining an outcome model with a propensity score model. The resulting Conditional Average Treatment Effect (CATE) estimates remain consistent if either of the two models is correctly specified. Cross-fitting is used to limit overfitting: the nuisance models (outcome and propensity) are trained on one part of the data and used to derive pseudo-outcomes on the remainder. These pseudo-outcomes are then used for CATE estimation.
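The pseudo-outcome construction described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a binary treatment, a randomized trial with a known constant propensity of 0.5, and plain linear outcome models as stand-ins for whatever nuisance learners the authors use. The names `fit_linear` and `dr_pseudo_outcomes` are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with intercept; returns a prediction function."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return lambda Xnew: np.column_stack([np.ones(len(Xnew)), Xnew]) @ beta

def dr_pseudo_outcomes(X, a, y, propensity=0.5, n_folds=5, seed=0):
    """Cross-fitted doubly robust pseudo-outcomes for a binary treatment a.

    Assumption: the propensity score is known and constant (randomized trial).
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = rng.permutation(n) % n_folds  # balanced random fold labels
    phi = np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        # Outcome models are fitted per arm on the training folds only
        mu1 = fit_linear(X[train & (a == 1)], y[train & (a == 1)])
        mu0 = fit_linear(X[train & (a == 0)], y[train & (a == 0)])
        m1, m0 = mu1(X[test]), mu0(X[test])
        at, yt = a[test], y[test]
        # Model-based contrast plus inverse-probability-weighted residuals
        phi[test] = (m1 - m0
                     + at * (yt - m1) / propensity
                     - (1 - at) * (yt - m0) / (1 - propensity))
    return phi

# Toy data (simulated for illustration): true CATE is 1 + x0
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 2))
a = rng.integers(0, 2, size=2000)
y = X[:, 1] + a * (1 + X[:, 0]) + rng.normal(size=2000)
phi = dr_pseudo_outcomes(X, a, y)
print(phi.mean())  # should be close to the average treatment effect
```

Each patient's pseudo-outcome is computed from models that never saw that patient's data, which is the point of cross-fitting. Regressing `phi` on the covariates then gives a CATE estimate.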

Figure 1 provides an overview of the WATCH workflow, which integrates the DR-learner for TEH analysis.

Figure 1: Overview of WATCH workflow and the four main steps: (1) Analysis Planning, (2) Initial Data Analysis and Analysis Dataset Creation, (3) TEH Exploration, and (4) Multidisciplinary Assessment.

Objective 1: Global Test for Heterogeneity

To assess evidence against homogeneity, the paper suggests using conditional inference procedures within a permutation test framework. If the treatment effect is homogeneous, the pseudo-outcomes should be unrelated to the covariates, so a test of independence between pseudo-outcomes and covariates serves as a global test for TEH. The approach uses two test statistics, a maximum-type and a quadratic-type statistic, each suited to different assumptions about how covariates relate to the treatment effect. Because the permutation-based test treats all covariates jointly, it accounts for multiple testing.
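A permutation test of this kind can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's procedure: the statistics are built from centered linear statistics between each covariate and the pseudo-outcome, the reference distribution is approximated by random permutations, and the function name `heterogeneity_permutation_test` is hypothetical.

```python
import numpy as np

def heterogeneity_permutation_test(X, phi, statistic="max", n_perm=999, seed=0):
    """Permutation test of independence between pseudo-outcomes and covariates.

    statistic="max":  largest absolute standardized covariate association.
    statistic="quad": quadratic form combining all covariate associations.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)          # centered covariates
    pc = phi - phi.mean()            # centered pseudo-outcomes
    col_norm = np.sqrt((Xc ** 2).sum(axis=0))
    gram_inv = np.linalg.pinv(Xc.T @ Xc)

    def stat(p):
        t = Xc.T @ p                 # linear statistic, one entry per covariate
        if statistic == "max":
            return float(np.max(np.abs(t) / col_norm))
        if statistic == "quad":
            return float(t @ gram_inv @ t)
        raise ValueError("statistic must be 'max' or 'quad'")

    observed = stat(pc)
    # Permuting the pseudo-outcomes breaks any link with the covariates
    perm = np.array([stat(rng.permutation(pc)) for _ in range(n_perm)])
    p_value = (1 + np.sum(perm >= observed)) / (1 + n_perm)
    return observed, p_value

# Toy data (simulated): the pseudo-outcome depends on the first covariate only
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5))
phi = 2 * X[:, 0] + rng.normal(size=300)
print(heterogeneity_permutation_test(X, phi, statistic="max"))
print(heterogeneity_permutation_test(X, phi, statistic="quad"))
```

A maximum-type statistic tends to be more sensitive when one or a few covariates drive the heterogeneity, while a quadratic form pools weaker signals spread across many covariates.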

Figure 2 shows the comparison of the two methods for testing heterogeneity.

Figure 2: Comparison of the two methods for testing heterogeneity with respect to Objective 1(i)

Objective 2: Identification of Effect Modifiers

Effect modifiers are identified using permutation-based importance scores derived from conditional random forests, a method chosen because its variable selection is unbiased with respect to variable type. The pseudo-outcomes are regressed on the covariates, and covariates are then ranked by their impact on the estimated treatment effect. The process uses the {party} R package to construct random forests of conditional inference trees, which can capture interactions among covariates.
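The idea behind permutation-based importance can be illustrated with a short sketch. Note the swap: the paper uses conditional random forests from the {party} R package, whereas the example below uses a plain linear model of the pseudo-outcome as a deliberately simple stand-in, and the function name `permutation_importance` is hypothetical. Only the permutation logic carries over.

```python
import numpy as np

def permutation_importance(X, phi, n_repeats=20, seed=0):
    """Rank covariates by the loss in predictive accuracy when each is shuffled.

    Stand-in model: linear regression of the pseudo-outcome on the covariates
    (the paper uses conditional random forests instead).
    """
    rng = np.random.default_rng(seed)
    n = len(phi)
    idx = rng.permutation(n)
    train, test = idx[: n // 2], idx[n // 2:]

    def design(M):
        return np.column_stack([np.ones(len(M)), M])

    beta, *_ = np.linalg.lstsq(design(X[train]), phi[train], rcond=None)

    def mse(M):
        return np.mean((phi[test] - design(M) @ beta) ** 2)

    baseline = mse(X[test])
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X[test].copy()
            # Shuffling column j destroys its relationship with the pseudo-outcome
            Xp[:, j] = rng.permutation(Xp[:, j])
            scores[j] += (mse(Xp) - baseline) / n_repeats
    return scores

# Toy data (simulated): covariates 0 and 2 modify the treatment effect
rng = np.random.default_rng(3)
X = rng.normal(size=(600, 5))
phi = 2 * X[:, 0] + 0.5 * X[:, 2] + rng.normal(size=600)
scores = permutation_importance(X, phi)
print(np.argsort(-scores))  # covariate indices, most important first
```

Scores near zero indicate covariates whose shuffling leaves predictions essentially unchanged, which is the pattern expected for every covariate when there is no TEH.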

Figure 3 presents the method's unbiased performance under no TEH and its capability to identify true predictive biomarkers.

Figure 3: Comparison of two methods for deriving effect modifiers with respect to Objective 2(i)

Objective 3: Estimation of Individualized Treatment Effects

For CATE estimation, the pseudo-outcomes are regressed on the covariates, and the fitted regression provides the CATE predictions. Cross-fitting plays a central role here: computing pseudo-outcomes from nuisance models fitted on separate folds limits overfitting bias in the CATE estimates. The paper evaluates multiple cross-fitting strategies in simulation studies, with the DR-learner consistently outperforming competing methods.

Figure 4 compares the CATE estimation strategies, showing the DR-learner's superior performance.

Figure 4: Comparison of three learners for estimating individual treatment effect with respect to Objective 3.

Real-World Application and Results

The paper applies the DR-learner in a pooled analysis of five Phase III psoriatic arthritis trials. The DR-learner identified key effect modifiers similar to those found in previous studies, demonstrating its consistency and reliability in real-world scenarios. Interestingly, CRPSI and BD-2 were identified as significant modifiers, aligning with existing research findings.

Figure 5 illustrates the importance ranking of variables as captured by the DR-learner.

Figure 5: Variable importance ranking showing top variables as effect modifiers.

Conclusion

The DR-learner framework, used within the WATCH workflow, offers robust methods for assessing TEH. Its ability to test for heterogeneity, identify effect modifiers, and estimate individualized treatment effects provides clinicians with tools to tailor treatments to patient-specific characteristics. Future work could focus on adapting these methodologies to other clinical endpoints, such as time-to-event outcomes. Through cross-fitting and simulation-based evaluation, the paper highlights the DR-learner's utility in both theoretical and practical applications in drug development and personalized medicine.
