Papers
Topics
Authors
Recent
Search
2000 character limit reached

Evaluating Transfer Learning Methods on Real-World Data Streams: A Case Study in Financial Fraud Detection

Published 29 Jul 2025 in q-fin.ST and cs.LG | (2508.02702v1)

Abstract: When the available data for a target domain is limited, transfer learning (TL) methods can be used to develop models on related data-rich domains, before deploying them on the target domain. However, these TL methods are typically designed with specific, static assumptions on the amount of available labeled and unlabeled target data. This is in contrast with many real world applications, where the availability of data and corresponding labels varies over time. Since the evaluation of the TL methods is typically also performed under the same static data availability assumptions, this would lead to unrealistic expectations concerning their performance in real world settings. To support a more realistic evaluation and comparison of TL algorithms and models, we propose a data manipulation framework that (1) simulates varying data availability scenarios over time, (2) creates multiple domains through resampling of a given dataset and (3) introduces inter-domain variability by applying realistic domain transformations, e.g., creating a variety of potentially time-dependent covariate and concept shifts. These capabilities enable simulation of a large number of realistic variants of the experiments, in turn providing more information about the potential behavior of algorithms when deployed in dynamic settings. We demonstrate the usefulness of the proposed framework by performing a case study on a proprietary real-world suite of card payment datasets. Given the confidential nature of the case study, we also illustrate the use of the framework on the publicly available Bank Account Fraud (BAF) dataset. By providing a methodology for evaluating TL methods over time and in realistic data availability scenarios, our framework facilitates understanding of the behavior of models and algorithms. This leads to better decision making when deploying models for new domains in real-world environments.

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.