
Towards Task-Compatible Compressible Representations (2405.10244v3)

Published 16 May 2024 in cs.CV and eess.SP

Abstract: We identify an issue in multi-task learnable compression, in which a representation learned for one task does not positively contribute to the rate-distortion performance of a different task as much as expected, given the estimated amount of information available in it. We interpret this issue using the predictive $\mathcal{V}$-information framework. In learnable scalable coding, previous work increased the utilization of side information for input reconstruction by also rewarding input reconstruction when learning the shared representation. We evaluate the impact of this idea in the context of input reconstruction more rigorously and extend it to other computer vision tasks. We perform experiments using representations trained for object detection on COCO 2017 and for depth estimation on Cityscapes, and use them to assist in image reconstruction and semantic segmentation tasks. The results show considerable improvements in the rate-distortion performance of the assisted tasks. Moreover, using the proposed representations, the performance of the base tasks is also improved. The results suggest that the proposed method induces simpler representations that are more compatible with downstream processes.
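For context on the framework named in the abstract: predictive $\mathcal{V}$-information (Xu et al., "A theory of usable information under computational constraints", ICLR 2020) measures how much information about $Y$ a computationally constrained family of predictors $\mathcal{V}$ can extract from $X$:

$$I_\mathcal{V}(X \to Y) = H_\mathcal{V}(Y \mid \varnothing) - H_\mathcal{V}(Y \mid X), \qquad H_\mathcal{V}(Y \mid X) = \inf_{f \in \mathcal{V}} \mathbb{E}\left[-\log f[X](Y)\right].$$

Under this lens, a learned representation can carry information in the Shannon sense that downstream models cannot readily use, which is the mismatch the abstract describes.

The training idea sketched in the abstract, rewarding input reconstruction alongside the base task while learning the shared representation, can be read as a joint rate-distortion objective. The notation below is our own illustration and is not taken from the paper:

$$\mathcal{L} = R(\hat{z}) + \lambda_{\text{task}}\, D_{\text{task}} + \lambda_{\text{rec}}\, D_{\text{rec}},$$

where $R(\hat{z})$ is the estimated rate of the quantized shared latent, $D_{\text{task}}$ is the base-task loss (e.g., object detection or depth estimation), $D_{\text{rec}}$ is an input-reconstruction distortion, and the $\lambda$ weights trade off the terms.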

