
A unified weighting framework for evaluating nearest neighbour classification (2311.16872v3)

Published 28 Nov 2023 in cs.LG and stat.ML

Abstract: We present the first comprehensive and large-scale evaluation of classical (NN), fuzzy (FNN) and fuzzy rough (FRNN) nearest neighbour classification. We standardise existing proposals for nearest neighbour weighting with kernel functions, applied to the distance values and/or ranks of the nearest neighbours of a test instance. In particular, we show that the theoretically optimal Samworth weights converge to a kernel. Kernel functions are closely related to fuzzy negation operators, and we propose a new kernel based on Yager negation. We also consider various distance and scaling measures, which we show can be related to each other. Through a systematic series of experiments on 85 real-life classification datasets, we find that NN, FNN and FRNN all perform best with Boscovich distance, and that NN and FRNN perform best with a combination of Samworth rank- and distance-weights and scaling by the mean absolute deviation around the median ($r_1$), the standard deviation ($r_2$) or the semi-interquartile range ($r_{\infty}^*$), while FNN performs best with only Samworth distance-weights and $r_1$- or $r_2$-scaling. However, NN achieves comparable performance with Yager-$\frac{1}{2}$ distance-weights, which are simpler to implement than a combination of Samworth distance- and rank-weights. Finally, FRNN generally outperforms NN, which in turn performs systematically better than FNN.
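To make the abstract's simpler NN configuration concrete, the following is a minimal sketch of distance-weighted nearest neighbour classification with $r_1$-scaling and a kernel built from the Yager negation $N_p(x) = (1 - x^p)^{1/p}$ with $p = \frac{1}{2}$. It is not the authors' implementation. Several points are assumptions made for illustration: Boscovich distance is taken to mean the city-block ($L_1$) distance, neighbour distances are normalised by the distance to the $(k+1)$-th neighbour before the kernel is applied, and the names `yager_kernel`, `r1_scale` and `weighted_nn_predict` are hypothetical.

```python
import numpy as np


def yager_kernel(x, p=0.5):
    """Yager negation N_p(x) = (1 - x**p)**(1/p) on [0, 1], used as a weight kernel."""
    x = np.clip(x, 0.0, 1.0)
    return (1.0 - x ** p) ** (1.0 / p)


def r1_scale(X_train):
    """Per-feature mean absolute deviation around the median (r_1 in the abstract)."""
    med = np.median(X_train, axis=0)
    mad = np.mean(np.abs(X_train - med), axis=0)
    # Constant features would give a zero scale; leave them unscaled instead.
    return np.where(mad > 0, mad, 1.0)


def weighted_nn_predict(X_train, y_train, x, k=3, p=0.5):
    """Predict the class of x from its k nearest neighbours with Yager distance-weights."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    x = np.asarray(x, dtype=float)

    # Assumption: Boscovich distance = city-block (L1) distance on r_1-scaled features.
    scale = r1_scale(X_train)
    dist = np.sum(np.abs(X_train - x) / scale, axis=1)

    order = np.argsort(dist, kind="stable")
    neighbours = order[:k]

    # Assumption: normalise by the (k+1)-th neighbour distance (or the last one
    # available) so that the k-th neighbour can still receive a nonzero weight.
    ref = dist[order[min(k, len(order) - 1)]]
    if ref > 0:
        weights = yager_kernel(dist[neighbours] / ref, p)
    else:
        weights = np.ones(len(neighbours))

    # Weighted vote: sum the weights of the neighbours belonging to each class.
    classes = np.unique(y_train)
    scores = np.array([weights[y_train[neighbours] == c].sum() for c in classes])
    return classes[np.argmax(scores)]


# Toy data, invented purely for illustration.
X_demo = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_demo = [0, 0, 0, 1, 1, 1]
pred_a = weighted_nn_predict(X_demo, y_demo, [0.2, 0.1], k=3)
pred_b = weighted_nn_predict(X_demo, y_demo, [5.5, 5.2], k=3)
```

Normalising by the $(k+1)$-th distance is only one possible convention; the paper's exact normalisation and its Samworth rank- and distance-weights are not reproduced here.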

