
Progressive Multi-task Anti-Noise Learning and Distilling Frameworks for Fine-grained Vehicle Recognition (2401.14336v1)

Published 25 Jan 2024 in cs.CV and cs.AI

Abstract: Fine-grained vehicle recognition (FGVR) is a fundamental technology for intelligent transportation systems, but it is very difficult because of its inherent intra-class variation. Most previous FGVR studies focus only on the intra-class variation caused by different shooting angles, positions, etc., while the intra-class variation caused by image noise has received little attention. This paper proposes a progressive multi-task anti-noise learning (PMAL) framework and a progressive multi-task distilling (PMD) framework to solve the intra-class variation problem in FGVR due to image noise. The PMAL framework achieves high recognition accuracy by treating image denoising as an additional task in image recognition and progressively forcing a model to learn noise invariance. The PMD framework transfers the knowledge of the PMAL-trained model into the original backbone network, producing a model with about the same recognition accuracy as the PMAL-trained model but without any additional overhead over the original backbone network. Combining the two frameworks, we obtain models that significantly exceed previous state-of-the-art methods in recognition accuracy on two widely used, standard FGVR datasets, namely Stanford Cars and CompCars, as well as three additional surveillance image-based vehicle-type classification datasets, namely Beijing Institute of Technology (BIT)-Vehicle, Vehicle Type Image Data 2 (VTID2), and Vehicle Images Dataset for Make Model Recognition (VIDMMR), without any additional overhead over the original backbone networks. The source code is available at https://github.com/Dichao-Liu/Anti-noise_FGVR
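For intuition, here is a minimal PyTorch-style sketch of the two training stages the abstract describes: a multi-task step that couples classification with denoising (the PMAL idea) and a distillation step that transfers the trained model's knowledge back into the plain backbone (the PMD idea). Everything below is an illustrative assumption rather than the authors' implementation: the names AntiNoiseModel, pmal_step, and pmd_step are hypothetical, the additive-Gaussian noise and lightweight decoder are stand-ins for whatever the paper actually uses, and the distillation loss shown is plain Hinton-style soft-label matching; consult the linked repository for the real method.

```python
# Hypothetical sketch of multi-task anti-noise training plus distillation.
# Names and design choices are illustrative, not the paper's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AntiNoiseModel(nn.Module):
    """Backbone with two heads: classification and image denoising."""
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone  # e.g. a ResNet feature extractor (B, C, H, W)
        self.classifier = nn.Linear(feat_dim, num_classes)
        # Hypothetical lightweight decoder reconstructing a clean image
        # from backbone features; stands in for the paper's denoising head.
        self.denoiser = nn.Sequential(
            nn.ConvTranspose2d(feat_dim, 64, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),
        )

    def forward(self, x):
        feats = self.backbone(x)                       # (B, feat_dim, H, W)
        logits = self.classifier(feats.mean(dim=(2, 3)))
        recon = self.denoiser(feats)
        return logits, recon

def pmal_step(model, clean_img, label, noise_std, opt):
    """One multi-task step: classify a noisy input and reconstruct the
    clean image from the same features."""
    noisy = clean_img + noise_std * torch.randn_like(clean_img)
    logits, recon = model(noisy)
    # Resize the clean target to the decoder's output resolution.
    target = F.interpolate(clean_img, size=recon.shape[-2:])
    loss = F.cross_entropy(logits, label) + F.mse_loss(recon, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

def pmd_step(student, teacher, img, label, opt, T=4.0, alpha=0.5):
    """One distillation step: the plain backbone (student, returns logits
    only) matches the softened outputs of the PMAL-trained teacher."""
    with torch.no_grad():
        t_logits, _ = teacher(img)
    s_logits = student(img)
    kd = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                  F.softmax(t_logits / T, dim=1),
                  reduction="batchmean") * T * T
    loss = alpha * kd + (1 - alpha) * F.cross_entropy(s_logits, label)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

The noise_std argument hints at the "progressive" aspect: it would be raised across training stages so the model gradually learns noise invariance, and since the denoising head is dropped after distillation, the final student carries no overhead beyond the original backbone.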


