Next Generation Loss Function for Image Classification (2404.12948v1)

Published 19 Apr 2024 in cs.CV, cs.LG, and cs.NE

Abstract: Neural networks are trained by minimizing a loss function that defines the discrepancy between the predicted model output and the target value. The selection of the loss function is crucial for achieving task-specific behaviour and strongly influences the capability of the model. A variety of loss functions have been proposed for a wide range of tasks, affecting training and model performance. For classification tasks, cross entropy is the de facto standard and usually the first choice. Here, we experimentally challenge well-known loss functions, including the cross-entropy (CE) loss, using genetic programming (GP), a population-based evolutionary algorithm. GP constructs loss functions from a set of operators and leaf nodes, and these functions are repeatedly recombined and mutated to find an optimal structure. Experiments were carried out on the small-sized datasets CIFAR-10, CIFAR-100, and Fashion-MNIST using an Inception model. The five best functions found were then evaluated with different model architectures on a set of standard datasets ranging from 2 to 102 classes and of very different sizes. One function, denoted the Next Generation Loss (NGL), clearly stood out, showing the same or better performance than CE on all tested datasets. To evaluate the NGL function on a large-scale dataset, we tested it on ImageNet-1k, where it improved top-1 accuracy compared to models trained with identical settings and other losses. Finally, the NGL was used to train models on a downstream segmentation task on the Pascal VOC 2012 and COCO-Stuff164k datasets, improving the underlying model performance.
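The GP search sketched in the abstract can be illustrated with a small, self-contained toy. The following Python/PyTorch code is an assumption-laden sketch, not the paper's implementation: the operator pool, the tiny linear proxy model, the synthetic dataset, the selection scheme, and all hyperparameters are placeholders chosen only to show how candidate loss trees are built, recombined, mutated, and scored by training a model with them.

```python
# Illustrative sketch of a GP loss-function search (not the paper's setup).
import random

import torch
import torch.nn as nn

torch.manual_seed(0)
random.seed(0)

# Candidate losses are expression trees over a small operator pool and two
# leaves: "y" (one-hot targets) and "p" (softmax probabilities). The operator
# pool below is an assumption for illustration; the paper's pool may differ.
OPS = {
    "add": (2, lambda a, b: a + b),
    "mul": (2, lambda a, b: a * b),
    "neg": (1, lambda a: -a),
    "log": (1, lambda a: torch.log(a.clamp_min(1e-12))),
}
LEAVES = ["y", "p"]

def random_tree(depth=3):
    if depth == 0 or random.random() < 0.3:
        return random.choice(LEAVES)
    op = random.choice(list(OPS))
    return [op] + [random_tree(depth - 1) for _ in range(OPS[op][0])]

def evaluate(tree, y, p):
    if tree == "y":
        return y
    if tree == "p":
        return p
    op, *children = tree
    return OPS[op][1](*(evaluate(c, y, p) for c in children))

def random_subtree(tree):
    if isinstance(tree, str) or random.random() < 0.5:
        return tree
    return random_subtree(random.choice(tree[1:]))

def crossover(a, b):
    # One-point recombination: graft a random subtree of `b` into `a`.
    if isinstance(a, str) or random.random() < 0.3:
        return random_subtree(b)
    a = list(a)
    i = random.randrange(1, len(a))
    a[i] = crossover(a[i], b)
    return a

def mutate(tree, depth=2):
    # Replace a random subtree with a freshly grown one.
    if isinstance(tree, str) or random.random() < 0.3:
        return random_tree(depth)
    tree = list(tree)
    i = random.randrange(1, len(tree))
    tree[i] = mutate(tree[i], depth)
    return tree

def fitness(tree, X, y_idx, n_classes=3, steps=200):
    # Proxy fitness: accuracy of a tiny linear classifier trained with the
    # candidate loss (the paper instead trains an Inception model on CIFAR
    # and Fashion-MNIST).
    model = nn.Linear(X.shape[1], n_classes)
    opt = torch.optim.Adam(model.parameters(), lr=0.1)
    y_onehot = torch.eye(n_classes)[y_idx]
    for _ in range(steps):
        p = model(X).softmax(dim=1)
        loss = evaluate(tree, y_onehot, p).mean()
        if not loss.requires_grad or not torch.isfinite(loss):
            return 0.0  # discard constant or numerically broken candidates
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (model(X).argmax(dim=1) == y_idx).float().mean().item()

# Toy 3-class dataset just to make the sketch runnable end to end.
X = torch.randn(300, 5)
y_idx = (X[:, 0] > 0).long() + (X[:, 1] > 0).long()  # classes 0, 1, 2

population = [random_tree() for _ in range(12)]
for generation in range(5):
    ranked = sorted(population, key=lambda t: fitness(t, X, y_idx), reverse=True)
    parents = ranked[:4]  # truncation selection
    population = parents + [
        mutate(crossover(random.choice(parents), random.choice(parents)))
        for _ in range(8)
    ]

best = max(population, key=lambda t: fitness(t, X, y_idx))
print("best loss tree:", best, "proxy accuracy:", fitness(best, X, y_idx))
```

In this toy, cross-entropy corresponds to the tree ["neg", ["mul", "y", ["log", "p"]]]; the search explores alternatives of the same form, which is the spirit of the experiments reported in the paper.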

