Enhance Sketch Recognition's Explainability via Semantic Component-Level Parsing (2312.07875v1)

Published 13 Dec 2023 in cs.CV

Abstract: Free-hand sketches appeal to humans as a universal tool for depicting the visual world. Humans can easily recognize varied sketches of a category by identifying the co-occurrence and layout of the category's intrinsic semantic components, since humans draw free-hand sketches based on a common consensus about which types of semantic components constitute each sketch category. For example, an airplane should have at least a fuselage and wings. Based on this analysis, a semantic component-level memory module is constructed and embedded in the structured sketch recognition network proposed in this paper. The memory keys, which represent the semantic components of each sketch category, can be self-learned and enhance the recognition network's explainability. The proposed networks can handle different sketch recognition settings, i.e., with or without semantic component labels for strokes. Experiments on the SPG and SketchIME datasets demonstrate the memory module's flexibility and the recognition network's explainability. The code and data are available at https://github.com/GuangmingZhu/SketchESC.
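As a rough illustration of the idea of memory keys that stand for semantic components, the sketch below soft-assigns stroke features to a small set of keys using cosine similarity followed by a softmax. This is not the paper's implementation: the function names, the 2-D toy features, the temperature value, and the "fuselage"/"wing" key ordering are all assumptions made for illustration, and in the paper the keys are learned rather than fixed by hand.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def softmax(scores, temperature=0.1):
    """Numerically stable softmax; the temperature is an assumed hyperparameter."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def assign_strokes(stroke_features, memory_keys):
    """Soft-assign each stroke feature to the memory keys.

    Returns a list of (weights, best_key_index) pairs, one per stroke.
    """
    results = []
    for feature in stroke_features:
        weights = softmax([cosine(feature, key) for key in memory_keys])
        best = max(range(len(weights)), key=weights.__getitem__)
        results.append((weights, best))
    return results

# Hypothetical keys for an "airplane" category: index 0 = fuselage, 1 = wing.
memory_keys = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical per-stroke feature vectors (in practice produced by an encoder).
stroke_features = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]

assignments = assign_strokes(stroke_features, memory_keys)
labels = [best for _, best in assignments]
print(labels)
```

Reading off which key each stroke is closest to gives a component-level account of a sketch, which is the kind of explanation the abstract refers to.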
