A Nested Genetic Algorithm for Explaining Classification Data Sets with Decision Rules (2209.07575v1)
Published 23 Aug 2022 in cs.NE, cs.AI, and cs.LG
Abstract: Our goal in this paper is to automatically extract a set of decision rules (a rule set) that best explains a classification data set. First, a large pool of decision rules is extracted from a set of decision trees trained on the data set. From this pool, we select a rule set that is concise and accurate, with maximum coverage and a minimum number of inconsistencies. This selection problem can be formalized as a modified version of the weighted budgeted maximum coverage problem, which is known to be NP-hard. To solve this combinatorial optimization problem efficiently, we introduce a nested genetic algorithm, which we then use to derive explanations for ten public data sets.
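To make the selection objective concrete, here is a minimal Python sketch of how a candidate rule set might be scored on coverage, accuracy, inconsistencies, and size. Everything in it is an illustrative assumption rather than the paper's formulation: the single-threshold `Rule` class, the unweighted sum in `fitness`, the `size_penalty` value, and the toy data are all hypothetical. The search is a brute-force enumeration over a tiny rule pool, used only as a stand-in; it is not the nested genetic algorithm the abstract refers to.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Rule:
    """Hypothetical single-condition rule: IF x[feature] (<= or >) threshold THEN label."""
    feature: int
    threshold: float
    greater: bool
    label: int

    def fires(self, x):
        value = x[self.feature]
        return value > self.threshold if self.greater else value <= self.threshold


def evaluate(rules, X, y):
    """Return (coverage, accuracy, inconsistency), each as a fraction of all samples."""
    covered = correct = inconsistent = 0
    for x, target in zip(X, y):
        labels = {r.label for r in rules if r.fires(x)}
        if not labels:
            continue  # no rule fires: sample is uncovered
        covered += 1
        if len(labels) > 1:
            inconsistent += 1  # firing rules disagree on the class
        elif target in labels:
            correct += 1
    n = len(X)
    return covered / n, correct / n, inconsistent / n


def fitness(rules, X, y, size_penalty=0.05):
    # Assumed objective: reward coverage and accuracy, penalise
    # inconsistencies and rule-set size. The weights are arbitrary.
    coverage, accuracy, inconsistency = evaluate(rules, X, y)
    return coverage + accuracy - inconsistency - size_penalty * len(rules)


def best_subset(pool, X, y, budget):
    """Exhaustive search over subsets of at most `budget` rules (toy pools only)."""
    best, best_score = (), float("-inf")
    for k in range(1, budget + 1):
        for subset in combinations(pool, k):
            score = fitness(subset, X, y)
            if score > best_score:
                best, best_score = subset, score
    return best, best_score


# Toy data set: one feature, two classes split at 2.5.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [0, 0, 1, 1]
pool = [
    Rule(0, 2.5, False, 0),  # x0 <= 2.5 -> class 0
    Rule(0, 2.5, True, 1),   # x0 >  2.5 -> class 1
    Rule(0, 1.5, True, 1),   # x0 >  1.5 -> class 1 (conflicts with the first rule at x0 = 2)
    Rule(0, 3.5, False, 0),  # x0 <= 3.5 -> class 0 (conflicts with the second rule at x0 = 3)
]
selected, score = best_subset(pool, X, y, budget=2)
```

On this toy pool the search keeps the two non-overlapping rules, since adding either overlapping rule introduces an inconsistency without improving coverage. Enumeration grows exponentially with the pool size, which is why a heuristic search such as a genetic algorithm becomes attractive for the large rule pools extracted from many decision trees.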