SAFE: A Neural Survival Analysis Model for Fraud Early Detection

Published 12 Sep 2018 in cs.LG, cs.AI, cs.CR, and stat.ML | (1809.04683v2)

Abstract: Many online platforms have deployed anti-fraud systems to detect and prevent fraudulent activities. However, there is usually a gap between the time that a user commits a fraudulent action and the time that the user is suspended by the platform. How to detect fraudsters in time is a challenging problem. Most of the existing approaches adopt classifiers to predict fraudsters given their activity sequences along time. The main drawback of classification models is that the prediction results between consecutive timestamps are often inconsistent. In this paper, we propose a survival analysis based fraud early detection model, SAFE, which maps dynamic user activities to survival probabilities that are guaranteed to be monotonically decreasing along time. SAFE adopts recurrent neural network (RNN) to handle user activity sequences and directly outputs hazard values at each timestamp, and then, survival probability derived from hazard values is deployed to achieve consistent predictions. Because we only observe the user suspended time instead of the fraudulent activity time in the training data, we revise the loss function of the regular survival model to achieve fraud early detection. Experimental results on two real world datasets demonstrate that SAFE outperforms both the survival analysis model and recurrent neural network model alone as well as state-of-the-art fraud early detection approaches.

Citations (39)

Summary

  • The paper introduces a GRU-based survival analysis framework that detects fraud early and keeps predictions consistent over time through monotonically decreasing survival probabilities.
  • The model improves detection accuracy and F1 scores over traditional methods by effectively processing time-varying user covariates.
  • It demonstrates a practical methodology for early fraud detection, paving the way for real-time monitoring on online platforms.

Introduction

The paper "SAFE: A Neural Survival Analysis Model for Fraud Early Detection" addresses the gap between the time a user commits a fraudulent action and the time the platform suspends that user. Classifiers applied to activity sequences can give inconsistent predictions at consecutive timestamps. The proposed Survival Analysis-based Fraud Early detection model (SAFE) instead maps user activities to a survival probability that is guaranteed to decrease monotonically over time, which keeps its predictions consistent.

Model Description and Advantages

SAFE uses an RNN (specifically, a GRU) to process time-varying user covariates and outputs a hazard value at each timestamp. The model then derives survival probabilities from these hazard values, and it is the survival probability, not the raw hazard, that is used to make consistent predictions over time. Unlike survival analysis models that assume a parametric distribution for survival times, SAFE makes no such assumption, which lets it fit real-world data whose survival times do not follow a convenient distribution.

Figure 1: An RNN-based survival analysis model for fraud early detection
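The step from hazard values to survival probabilities can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a softplus activation to keep hazards non-negative and a survival probability defined as the exponential of the negative cumulative hazard. The raw scores stand in for the GRU output at each timestamp and are hypothetical values.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + e^x); always strictly positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def survival_curve(raw_scores):
    """Turn per-timestamp raw scores into survival probabilities.

    Assumption of this sketch: hazard_t = softplus(score_t) and
    S_t = exp(-(hazard_1 + ... + hazard_t)).
    """
    cumulative_hazard = 0.0
    curve = []
    for score in raw_scores:
        cumulative_hazard += softplus(score)  # hazards only ever add up
        curve.append(math.exp(-cumulative_hazard))
    return curve


# Hypothetical raw scores for five timestamps of one user.
scores = [-3.0, -2.5, 0.5, 2.0, -1.0]
curve = survival_curve(scores)
print([round(s, 3) for s in curve])
```

Because every hazard is positive, the cumulative sum can only grow, so the resulting curve can only fall. That is the property the model relies on for consistent predictions, whatever the raw scores look like.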

SAFE uses a loss function adapted to labels that arrive late: the training data records only when a user was suspended, not when the fraudulent activity took place. A regular survival model would treat the suspension time as the event time, so the authors revise its loss function so that the model can flag fraudsters before the platform enacts a suspension.
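For context, the regular survival objective that SAFE starts from can be sketched as follows. This is the standard negative log-likelihood with right censoring, written in a discrete-time form as an assumption of this sketch. It is not the revised SAFE loss, whose exact form is not given in this summary. The function and argument names are illustrative.

```python
import math


def standard_survival_nll(hazards, event_observed):
    """Negative log-likelihood of one sequence under a regular survival model.

    hazards: positive hazard values for timestamps 1..T, where T is the
        last observed timestamp.
    event_observed: True if the event (here, suspension) happened at T,
        False if the sequence is censored at T.
    """
    log_survival = -sum(hazards)  # log S(T) = -(cumulative hazard)
    log_likelihood = log_survival
    if event_observed:
        # An observed event adds the log hazard at the event time.
        log_likelihood += math.log(hazards[-1])
    return -log_likelihood


# Hypothetical hazards for a user suspended at the third timestamp.
print(standard_survival_nll([0.1, 0.2, 0.7], event_observed=True))
```

Under this objective, the only hazard that is rewarded for being large is the one at the recorded event time, which here is the suspension time. That appears to be the limitation the revised loss is meant to address.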

Empirical Evaluation

The paper compares SAFE with baseline approaches such as SVM and CPH and with more advanced approaches such as M-LSTM. On two real-world datasets, Twitter and Wikipedia, SAFE outperformed these models in accuracy and F1 score.

Figure 2: Comparison of the survival analysis-based approach and the classification-based approach for fraud early detection. A red square indicates that the user is predicted to be a fraudster at time t, while a green circle indicates that the user is predicted to be normal.
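The contrast in Figure 2 can be reproduced with a toy example. The sketch below thresholds two hypothetical sequences for one user: independent per-timestamp classifier scores, and a monotonically decreasing survival curve. All numbers and the 0.5 threshold are made up for illustration and do not come from the paper.

```python
def first_flag_index(survival, threshold=0.5):
    """Index of the first timestamp whose survival falls below the threshold."""
    for t, s in enumerate(survival):
        if s < threshold:
            return t
    return None  # never flagged


def count_label_flips(flags):
    """Number of times the predicted label changes between consecutive timestamps."""
    return sum(1 for a, b in zip(flags, flags[1:]) if a != b)


# Hypothetical per-timestamp fraud probabilities from a classifier.
classifier_scores = [0.2, 0.7, 0.4, 0.8, 0.3]
# Hypothetical survival probabilities for the same user (non-increasing).
survival = [0.95, 0.80, 0.45, 0.30, 0.10]

classifier_flags = [p > 0.5 for p in classifier_scores]
survival_flags = [s < 0.5 for s in survival]

print(count_label_flips(classifier_flags))  # label changes back and forth
print(count_label_flips(survival_flags))    # at most one change
print(first_flag_index(survival))           # timestamp of first detection
```

With a monotone curve and a fixed threshold, the label can change at most once, from normal to fraudster, so the first flagged timestamp is a well-defined detection time.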

SAFE also detected a higher percentage of fraudsters early and flagged them a greater number of timestamps ahead of suspension. These results indicate that the model not only identifies fraudulent users but does so well before the platform would otherwise suspend them.

Figure 3: Percentage of early detected fraudsters

Theoretical Implications and Future Work

The work combines survival analysis with an RNN, pairing temporally consistent predictions with the sequence-modeling capacity of a recurrent network. Because the model does not assume a fixed parametric form for survival times, it may adapt to a wider range of datasets.

Future improvements may focus on refining the model to predict precise fraudulent activity times rather than just suspension times. Additionally, adapting SAFE for deployment in real-time monitoring systems could enhance its applicability across various online platforms, offering a rapid and reliable detection mechanism against emerging threats.

In conclusion, SAFE combines established survival analysis with recurrent neural networks. Its reported results on both datasets show advantages over the compared baselines, both in accuracy and in how early fraudsters are detected.
