
On Finding Bi-objective Pareto-optimal Fraud Prevention Rule Sets for Fintech Applications (2311.00964v3)

Published 2 Nov 2023 in cs.LG and q-fin.ST

Abstract: Rules are widely used in Fintech institutions to make fraud prevention decisions, since rules are highly interpretable thanks to their intuitive if-then structure. In practice, a two-stage framework of fraud prevention decision rule set mining is usually employed in large Fintech institutions; Stage 1 generates a potentially large pool of rules and Stage 2 aims to produce a refined rule subset according to some criteria (typically based on precision and recall). This paper focuses on improving the flexibility and efficacy of this two-stage framework, and is concerned with finding high-quality rule subsets in a bi-objective space (such as precision and recall). To this end, we first introduce a novel algorithm called SpectralRules that directly generates a compact pool of rules in Stage 1 with high diversity. We empirically find such diversity improves the quality of the final rule subset. In addition, we introduce an intermediate stage between Stage 1 and 2 that adopts the concept of Pareto optimality and aims to find a set of non-dominated rule subsets, which constitutes a Pareto front. This intermediate stage greatly simplifies the selection criteria and increases the flexibility of Stage 2. For this intermediate stage, we propose a heuristic-based framework called PORS and we identify that the core of PORS is the problem of solution selection on the front (SSF). We provide a systematic categorization of the SSF problem and a thorough empirical evaluation of various SSF methods on both public and proprietary datasets. On two real application scenarios within Alipay, we demonstrate the advantages of our proposed methodology over existing work.
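To make the idea of a Pareto front in a bi-objective (precision, recall) space concrete, here is a minimal sketch of extracting the non-dominated points from a list of candidate rule subsets. It is not the paper's PORS framework, nor any of its SSF methods. The function name `pareto_front`, the `(precision, recall)` tuple layout, and the sample values are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: each candidate rule subset is summarized
# by a (precision, recall) pair, and both objectives are maximized.
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by increasing recall."""
    # Visit points from highest to lowest recall; on recall ties,
    # visit the highest-precision point first.
    ordered = sorted(set(points), key=lambda p: (-p[1], -p[0]))
    front: List[Point] = []
    best_precision = float("-inf")
    for precision, recall in ordered:
        # Every point already visited has recall >= this one, so this
        # point is non-dominated only if its precision is strictly higher
        # than the best precision seen so far.
        if precision > best_precision:
            front.append((precision, recall))
            best_precision = precision
    return front[::-1]


# Illustrative values only, not results from the paper.
candidates = [
    (0.95, 0.20),
    (0.90, 0.35),
    (0.80, 0.35),
    (0.70, 0.60),
    (0.60, 0.55),
    (0.50, 0.80),
]
print(pareto_front(candidates))
```

In this sketch, a later stage could pick any point from the returned front according to a business constraint, for example the highest-recall subset whose precision stays above a chosen threshold.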

