- The paper introduces IF-DFM, a framework that uses influence functions to adjust CVR models for delayed conversions.
- It reformulates the inverse Hessian-vector product as an optimization problem, enabling scalable and efficient parameter updates.
- Experimental results on Criteo and Taobao datasets show improved AUC, PRAUC, and Log Loss compared to traditional methods.
Delayed Feedback Modeling with Influence Functions
The paper addresses delayed feedback in conversion rate (CVR) prediction for online advertising. Under cost-per-conversion (CPA) pricing, advertisers are charged only when a user interaction leads to a conversion, so accurate CVR prediction is pivotal for optimizing revenue. Because a conversion can occur well after the initial click, recent samples may be labeled before their outcome is known, which leaves the training data incomplete and biases the model.
Proposed Framework
The paper introduces Influence Function-empowered Delayed Feedback Modeling (IF-DFM), a framework that models the impact of delayed conversions on CVR predictions. It uses influence functions to estimate how newly arrived and delayed conversions change the model parameters, so the model can be updated efficiently without full retraining.
Figure 1: The framework of offline CVR methods, online CVR methods, and IF-DFM.
Figure 1 contrasts offline CVR methods, online CVR methods, and the proposed IF-DFM approach. Offline methods train on static historical data and often adapt poorly to shifts in user interests. Online methods update the model incrementally on newly observed data, but these repeated updates can be redundant and inefficient.
Influence Functions and Delayed Feedback
Influence functions, which originate in robust statistics, estimate how a perturbation of the training data changes the model parameters. This gives a principled way to adjust a model for newly arrived feedback without retraining. The costly step in any influence function calculation is the inverse Hessian-vector product; IF-DFM reformulates it as an optimization problem so that the computation scales to large models.
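For reference, the standard influence function result can be written as follows. This is the textbook form in generic notation (loss \ell, training samples z_i, trained parameters \hat{\theta}), given here as a sketch; the paper's own notation and exact formulation may differ. The last line states the well-known identity that turns an inverse Hessian-vector product into a quadratic minimization, assuming H is positive definite.

```latex
% Empirical risk minimizer, and the minimizer after upweighting a sample z by \epsilon
\hat{\theta} = \arg\min_{\theta} \frac{1}{n}\sum_{i=1}^{n} \ell(z_i, \theta),
\qquad
\hat{\theta}_{\epsilon, z} = \arg\min_{\theta} \frac{1}{n}\sum_{i=1}^{n} \ell(z_i, \theta) + \epsilon\,\ell(z, \theta)

% First-order effect of the perturbation on the parameters
\left.\frac{d\hat{\theta}_{\epsilon, z}}{d\epsilon}\right|_{\epsilon = 0}
  = -H_{\hat{\theta}}^{-1}\,\nabla_{\theta}\ell(z, \hat{\theta}),
\qquad
H_{\hat{\theta}} = \frac{1}{n}\sum_{i=1}^{n} \nabla_{\theta}^{2}\ell(z_i, \hat{\theta})

% Inverse Hessian-vector product as an optimization problem (H positive definite)
H_{\hat{\theta}}^{-1} v = \arg\min_{u}\; \tfrac{1}{2}\,u^{\top} H_{\hat{\theta}}\,u - v^{\top} u
```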
Figure 2: An illustration of the delayed feedback problem in CVR tasks.
Methodology
IF-DFM addresses two key issues: label reversal and integration of new data. Label reversal occurs when samples initially labeled as negative eventually convert, necessitating corrections. Newly arrived data, indicative of recent user behavior, needs to be integrated efficiently to keep the models adaptive.
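Under the usual first-order influence approximation, both cases reduce to an inverse Hessian-vector product with a different right-hand side. The expressions below are a sketch in generic notation, assuming a per-sample loss \ell, n training samples, trained parameters \hat{\theta}, and the empirical Hessian H_{\hat{\theta}} of the average training loss; they are not copied from the paper.

```latex
% Label reversal: sample x_j was trained with label 0 and later converts (label 1)
\Delta\theta_{\text{flip}} \approx -\frac{1}{n}\, H_{\hat{\theta}}^{-1}
  \left[ \nabla_{\theta}\ell\big((x_j, 1), \hat{\theta}\big)
       - \nabla_{\theta}\ell\big((x_j, 0), \hat{\theta}\big) \right]

% New data: a freshly observed sample z_new is added to the training set
\Delta\theta_{\text{new}} \approx -\frac{1}{n}\, H_{\hat{\theta}}^{-1}\,
  \nabla_{\theta}\ell\big(z_{\text{new}}, \hat{\theta}\big)
```

For the logistic loss, the bracketed gradient difference in the first expression simplifies to -x_j, so a label reversal moves the parameters in the direction H_{\hat{\theta}}^{-1} x_j.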
To compute the resulting parameter changes, the paper casts the inverse Hessian-vector product as a finite-sum quadratic optimization problem. Because the objective is a sum over training samples, it can be minimized with stochastic optimization methods such as SGD and its variants, which yields the parameter update efficiently and avoids full retraining.
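The following is a minimal sketch of this idea, not the paper's implementation. It assumes a logistic-regression CVR model with a small damping term (a stand-in for L2 regularization), and approximates H^{-1} v by running SGD on f(u) = mean_i [0.5 u^T H_i u - v^T u], whose minimizer is H^{-1} v. The function names, the `damping` value, the step-size schedule, the toy data, and `theta` are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def hessian_vector_product(X, theta, u, damping):
    """Exact H u, with H = mean_i p_i (1 - p_i) x_i x_i^T + damping * I."""
    n = len(X)
    out = [damping * uk for uk in u]
    for x in X:
        p = sigmoid(dot(theta, x))
        coeff = p * (1.0 - p) * dot(x, u) / n
        for k in range(len(u)):
            out[k] += coeff * x[k]
    return out


def inverse_hvp_sgd(X, theta, v, damping=0.1, steps=20000,
                    lr0=0.05, decay=500.0, seed=0):
    """Approximate H^{-1} v by SGD on the finite-sum quadratic
    f(u) = mean_i [0.5 * u^T H_i u - v^T u]."""
    rng = random.Random(seed)
    u = [0.0] * len(v)
    for t in range(steps):
        x = X[rng.randrange(len(X))]          # sample one training example
        p = sigmoid(dot(theta, x))
        coeff = p * (1.0 - p) * dot(x, u)     # scalar part of H_i u
        lr = lr0 / (1.0 + t / decay)          # decaying step size (assumed schedule)
        # Stochastic gradient of f at u is H_i u - v.
        u = [uk - lr * (coeff * xk + damping * uk - vk)
             for uk, xk, vk in zip(u, x, v)]
    return u


# Toy data: 200 samples with 3 features (purely illustrative).
data_rng = random.Random(42)
X = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
theta = [0.5, -0.3, 0.2]   # hypothetical, stands in for already-trained parameters

# Label reversal for sample 0 (label 0 -> 1). For the logistic loss,
# grad(x, 1) - grad(x, 0) = (p - 1) x - p x = -x.
x_flip = X[0]
v = [-xk for xk in x_flip]

u = inverse_hvp_sgd(X, theta, v)              # u approximates H^{-1} v
delta_theta = [-uk / len(X) for uk in u]      # first-order parameter correction
theta_updated = [tk + dk for tk, dk in zip(theta, delta_theta)]
```

Each SGD step touches a single sample and never forms the Hessian, which is the property that makes this formulation attractive for large models. The exact `hessian_vector_product` is included only to check the residual H u - v on the toy problem.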
Experimental Results
The experiments use the Criteo and Taobao datasets and cover both offline and online settings. IF-DFM consistently outperforms existing methods on AUC, PRAUC, and Log Loss.
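For readers unfamiliar with these metrics, here is a small sketch of how they can be computed from binary labels and predicted conversion probabilities. It is not the paper's evaluation code: the function names and toy values are hypothetical, and PRAUC is approximated here by average precision, which is one common estimator and an assumption on our part.

```python
import math


def log_loss(y_true, y_prob, eps=1e-12):
    """Mean negative log-likelihood of the labels (lower is better)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_prob):
    """Mean precision at the rank of each positive (a PR-AUC estimator)."""
    ranked = sorted(zip(y_prob, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy labels (1 = converted) and predicted CVRs, purely illustrative.
labels = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.6]
metrics = {
    "auc": auc(labels, scores),
    "prauc": average_precision(labels, scores),
    "log_loss": log_loss(labels, scores),
}
```

Higher is better for AUC and PRAUC, while lower is better for Log Loss.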

Figure 3: Offline experimental results on Taobao dataset.
Figure 4: Online experimental results on Criteo dataset.
IF-DFM also adapted notably better to changing user preferences, which the paper attributes to its efficient use of influence functions to incorporate feedback in real time.
Conclusion
IF-DFM mitigates delayed feedback in CVR prediction by using influence functions to estimate directly how new and corrected data change the model parameters. This allows timely parameter updates without the computational overhead of full retraining. The reported results show both gains in performance metrics and adaptability in dynamic advertising environments. Future work may focus on improving the influence estimation process and on validating the method in real-world deployments through A/B testing.