Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
166 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Minimal cost feature selection of data with normal distribution measurement errors (1211.2512v2)

Published 12 Nov 2012 in cs.AI and cs.LG

Abstract: Minimal cost feature selection is devoted to obtain a trade-off between test costs and misclassification costs. This issue has been addressed recently on nominal data. In this paper, we consider numerical data with measurement errors and study minimal cost feature selection in this model. First, we build a data model with normal distribution measurement errors. Second, the neighborhood of each data item is constructed through the confidence interval. Comparing with discretized intervals, neighborhoods are more reasonable to maintain the information of data. Third, we define a new minimal total cost feature selection problem through considering the trade-off between test costs and misclassification costs. Fourth, we proposed a backtracking algorithm with three effective pruning techniques to deal with this problem. The algorithm is tested on four UCI data sets. Experimental results indicate that the pruning techniques are effective, and the algorithm is efficient for data sets with nearly one thousand objects.

Citations (1)

Summary

We haven't generated a summary for this paper yet.