GCAM: Gaussian and causal-attention model of food fine-grained recognition (2403.12109v1)

Published 18 Mar 2024 in cs.LG, cs.AI, and cs.CV

Abstract: Most current food recognition methods rely on deep learning for category classification. However, these approaches struggle to distinguish visually similar food samples, which makes fine-grained recognition a pressing problem in food recognition. To address this, we propose a Gaussian and causal-attention model (GCAM) for fine-grained object recognition. In particular, we train the model to obtain Gaussian features over target regions and then extract fine-grained features from the objects, thereby enhancing the feature mapping capabilities of the target regions. To counteract data drift resulting from uneven data distributions, we employ a counterfactual reasoning approach. Using counterfactual interventions, we analyze the impact of the learned image attention mechanism on network predictions, enabling the network to acquire more useful attention weights for fine-grained image recognition. Finally, we design a learnable loss strategy to balance training stability across the modules, ultimately improving the accuracy of the final target recognition. We validate our approach on four datasets. Experiments show that GCAM surpasses state-of-the-art methods on the ETH-FOOD101, UECFOOD256, and Vireo-FOOD172 datasets, and it also achieves state-of-the-art performance on the CUB-200 dataset.
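
The counterfactual intervention described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the effect of attention is measured as the difference between the logits produced with the learned attention map and the logits produced with a random attention map, which is one common way to set up such an intervention; the abstract does not give the exact form. The names `attention_pool` and `counterfactual_effect`, the linear classifier, and the tensor shapes are all hypothetical.

```python
import numpy as np


def attention_pool(features, attention):
    """Pool an (H, W, C) feature map into a (C,) vector using an (H, W) attention map."""
    # Normalize the attention map so its weights sum to (approximately) one.
    weights = attention / (attention.sum() + 1e-8)
    return (features * weights[..., None]).sum(axis=(0, 1))


def counterfactual_effect(features, attention, classifier_w, rng):
    """Return (factual_logits, effect_logits) for one image.

    Assumed formulation (not confirmed by the source):
        effect = logits(learned attention) - logits(random attention)
    """
    factual = attention_pool(features, attention) @ classifier_w
    # Counterfactual intervention: swap the learned attention for a random map.
    random_attention = rng.uniform(size=attention.shape)
    counterfactual = attention_pool(features, random_attention) @ classifier_w
    return factual, factual - counterfactual


# Usage with made-up shapes: a 7x7 feature map, 16 channels, 5 classes.
rng = np.random.default_rng(0)
features = rng.normal(size=(7, 7, 16))
attention = rng.uniform(size=(7, 7))
classifier_w = rng.normal(size=(16, 5))
factual_logits, effect_logits = counterfactual_effect(
    features, attention, classifier_w, rng
)
```

In a training loop under this assumption, a classification loss on `effect_logits` would be added to the usual loss on `factual_logits`, so that attention maps are rewarded only when they outperform a random map. How GCAM weights these terms against each other is part of its learnable loss strategy and is not reproduced here.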
