DC4L: Distribution Shift Recovery via Data-Driven Control for Deep Learning Models (2302.10341v3)

Published 20 Feb 2023 in cs.LG and cs.CV

Abstract: Deep neural networks have repeatedly been shown to be non-robust to the uncertainties of the real world, even to naturally occurring ones. The vast majority of current approaches have focused on data-augmentation methods that expand the range of perturbations the classifier is exposed to during training. A relatively unexplored but equally promising avenue involves sanitizing an image as a preprocessing step, depending on the nature of the perturbation. In this paper, we propose to use control for learned models to recover from distribution shifts online. Specifically, our method applies a sequence of semantic-preserving transformations to bring the shifted data closer in distribution to the training set, as measured by the Wasserstein distance. Our approach is to 1) formulate the problem of distribution shift recovery as a Markov decision process, which we solve using reinforcement learning, 2) identify a minimum condition on the data for our method to be applied, which we check online using a binary classifier, and 3) employ dimensionality reduction through orthonormal projection to aid in our estimates of the Wasserstein distance. We provide theoretical evidence that orthonormal projection preserves characteristics of the data at the distributional level. We apply our distribution shift recovery approach to the ImageNet-C benchmark for distribution shifts, demonstrating an improvement in average accuracy of up to 14.21% across a variety of state-of-the-art ImageNet classifiers. We further show that our method generalizes to composites of shifts from the ImageNet-C benchmark, achieving improvements in average accuracy of up to 9.81%. Finally, we test our method on CIFAR-100-C and report improvements of up to 8.25%.
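
A minimal sketch of the abstract's third step, estimating the Wasserstein distance between the training set and incoming shifted data after an orthonormal projection, is given below. This is not the authors' implementation: the feature dimension, the QR-based random orthonormal basis, and the per-coordinate averaging of 1-D Wasserstein-1 distances (a sliced-Wasserstein-style surrogate) are illustrative assumptions.

```python
# Illustrative sketch, not the paper's code: compare a reference set and
# shifted data in a low-dimensional space given by an orthonormal projection.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

def orthonormal_basis(d_in: int, d_out: int) -> np.ndarray:
    # Columns of Q from a reduced QR factorization are orthonormal.
    q, _ = np.linalg.qr(rng.standard_normal((d_in, d_out)))
    return q  # shape (d_in, d_out)

def projected_w1(reference: np.ndarray, shifted: np.ndarray,
                 d_out: int = 8) -> float:
    # Project both sets with the same basis, then average per-coordinate
    # 1-D Wasserstein-1 distances as a cheap distributional surrogate.
    basis = orthonormal_basis(reference.shape[1], d_out)
    a, b = reference @ basis, shifted @ basis
    return float(np.mean([wasserstein_distance(a[:, j], b[:, j])
                          for j in range(d_out)]))

# Toy check: a mean shift should register a larger distance than fresh
# samples drawn from the same distribution.
reference = rng.standard_normal((1000, 512))
print(projected_w1(reference, reference + 0.5))                    # larger
print(projected_w1(reference, rng.standard_normal((1000, 512))))  # near 0
```

In the paper's framing, such a distance estimate would score candidate sequences of semantic-preserving transformations; the stated theoretical result that orthonormal projection preserves distributional characteristics is what justifies comparing distances in the reduced space.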
