Efficient construction of the lattice of frequent closed patterns and simultaneous extraction of generic bases of rules (1312.1558v2)
Abstract: In the last few years, the amount of collected data, in various computer science applications, has grown considerably. These large volumes of data need to be analyzed in order to extract useful hidden knowledge. This work focuses on association rule extraction. This technique is one of the most popular in data mining. Nevertheless, the number of extracted association rules is often very high, and many of them are redundant. In this paper, we propose a new algorithm, called PRINCE. Its main feature is the construction of a partially ordered structure for extracting subsets of association rules, called generic bases. Without loss of information these subsets form representation of the whole association rule set. To reduce the cost of such a construction, the partially ordered structure is built thanks to the minimal generators associated to frequent closed patterns. The closed ones are simultaneously derived with generic bases thanks to a simple bottom-up traversal of the obtained structure. The experimentations we carried out in benchmark and "worst case" contexts showed the efficiency of the proposed algorithm, compared to algorithms like CLOSE, A-CLOSE and TITANIC.