
AUGCAL: Improving Sim2Real Adaptation by Uncertainty Calibration on Augmented Synthetic Images (2312.06106v3)

Published 11 Dec 2023 in cs.CV and cs.LG

Abstract: Synthetic data (SIM) drawn from simulators has emerged as a popular alternative for training models where acquiring annotated real-world images is difficult. However, transferring models trained on synthetic images to real-world applications can be challenging due to appearance disparities. A commonly employed solution to counter this SIM2REAL gap is unsupervised domain adaptation, where models are trained using labeled SIM data and unlabeled REAL data. Mispredictions made by such SIM2REAL adapted models are often associated with miscalibration, stemming from overconfident predictions on real data. In this paper, we introduce AUGCAL, a simple training-time patch for unsupervised adaptation that improves SIM2REAL adapted models by (1) reducing overall miscalibration, (2) reducing overconfidence in incorrect predictions, and (3) improving confidence score reliability by better guiding misclassification detection, all while retaining or improving SIM2REAL performance. Given a base SIM2REAL adaptation algorithm, at training time, AUGCAL involves replacing vanilla SIM images with strongly augmented views (AUG intervention) and additionally optimizing for a training-time calibration loss on augmented SIM predictions (CAL intervention). We motivate AUGCAL with a brief analytical justification of how miscalibration on unlabeled REAL data can be reduced. Through our experiments, we empirically show the efficacy of AUGCAL across multiple adaptation methods, backbones, tasks, and shifts.
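To make the CAL intervention concrete, the sketch below shows one plausible form of a training-time objective on augmented SIM predictions: the usual task cross-entropy plus a calibration penalty that shrinks the gap between mean confidence and accuracy (a DCA-style term). This is an illustrative NumPy sketch under assumed names (`augcal_loss`, `lam`), not the paper's actual implementation or its exact choice of calibration loss.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class dimension.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def augcal_loss(logits_aug, labels, lam=1.0):
    """AUGCAL-style objective on predictions for strongly *augmented* SIM images:
    cross-entropy plus |mean confidence - accuracy| as a calibration penalty.
    Hypothetical sketch; the paper's loss and weighting may differ."""
    probs = softmax(logits_aug)
    n = len(labels)
    # Task loss: negative log-likelihood of the ground-truth SIM labels.
    ce = -np.log(probs[np.arange(n), labels] + 1e-12).mean()
    # Calibration term: penalize confidence that outruns accuracy.
    conf = probs.max(axis=1).mean()
    acc = (probs.argmax(axis=1) == labels).mean()
    cal = abs(conf - acc)
    return ce + lam * cal
```

Confident correct predictions incur almost no penalty, while confident mistakes are charged both the cross-entropy and the calibration term, which is the overconfidence behavior AUGCAL targets.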

