New Proximity Estimate for Incremental Update of Non-uniformly Distributed Clusters (1310.6833v1)
Abstract: Conventional clustering algorithms mine static databases and generate a set of patterns in the form of clusters. Many real-life databases, however, keep growing incrementally. For such dynamic databases, the patterns extracted from the original database become obsolete. Conventional clustering algorithms are therefore unsuitable for incremental databases, because they cannot modify the clustering results in accordance with recent updates. In this paper, the author proposes a new incremental clustering algorithm for numerical data called CFICA (Cluster Feature-Based Incremental Clustering Approach) and suggests a new proximity metric called Inverse Proximity Estimate (IPE), which considers the proximity of a data point to a cluster representative as well as its proximity to a farthest point in its vicinity. CFICA uses the proposed proximity metric to determine the membership of a data point in a cluster.
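
The abstract does not state the IPE formula, so the following is only a minimal sketch of the general idea under stated assumptions, not the author's definition. It assumes a cluster is summarised by a count and a per-dimension linear sum (a common cluster-feature representation), that each cluster keeps a short list of far points, and that the score is the distance to the centroid plus the distance to the nearest tracked far point. The names `Cluster`, `proximity_score`, `assign`, and `threshold` are hypothetical.

```python
import math
from dataclasses import dataclass, field


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


@dataclass
class Cluster:
    # Assumed cluster summary: point count and per-dimension linear sum.
    n: int
    linear_sum: list
    # Assumed: a few member points lying far from the centroid.
    far_points: list = field(default_factory=list)

    @property
    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def add(self, point):
        # Incremental update: no pass over earlier points is needed.
        self.n += 1
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]


def proximity_score(point, cluster):
    """Hypothetical stand-in for IPE (smaller means closer)."""
    d_centroid = euclidean(point, cluster.centroid)
    if not cluster.far_points:
        return d_centroid
    # Distance to the tracked far point nearest to the new point.
    d_far = min(euclidean(point, q) for q in cluster.far_points)
    return d_centroid + d_far


def assign(point, clusters, threshold):
    """Add point to the best-scoring cluster, or open a new one.

    Assumes `clusters` is non-empty.
    """
    best = min(clusters, key=lambda c: proximity_score(point, c))
    if proximity_score(point, best) <= threshold:
        best.add(point)
        return best
    new_cluster = Cluster(n=1, linear_sum=list(point))
    clusters.append(new_cluster)
    return new_cluster
```

Including a far point in the score lets a point near the edge of an elongated or unevenly dense cluster still score as close, even when it is relatively far from the centroid. How the far points are chosen and maintained is left open in this sketch.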