
Multiple Random Masking Autoencoder Ensembles for Robust Multimodal Semi-supervised Learning (2402.08035v1)

Published 12 Feb 2024 in cs.CV

Abstract: An increasing number of real-world problems in computer vision and machine learning require taking into account multiple interpretation layers (modalities or views) of the world and learning how they relate to each other. For example, in the case of Earth observations from satellite data, it is important to be able to predict one observation layer (e.g. vegetation index) from other layers (e.g. water vapor, snow cover, temperature), both to better understand how the Earth system functions and to reliably infer information for a layer when its data is missing (e.g. due to measurement failure or error).
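The idea the abstract describes, masking a random subset of modalities and ensembling several reconstructors that each predict the hidden layers from the visible ones, can be sketched as a minimal toy in NumPy. This is not the paper's architecture: the data, the linear least-squares reconstructors, and all names below are hypothetical stand-ins for the masked-autoencoder ensemble, chosen only to make the masking-and-averaging mechanics concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy multimodal data: N samples, M = 4 scalar "modalities" sharing a
# low-dimensional latent cause, so any channel is predictable from the
# others (stand-ins for e.g. vegetation index, water vapor, snow, temp).
N, M = 500, 4
latent = rng.normal(size=(N, 2))
W_true = rng.normal(size=(2, M))
X = latent @ W_true + 0.05 * rng.normal(size=(N, M))

def fit_reconstructor(X, mask):
    """Least-squares map from the visible channels to the masked ones
    (a stand-in for training one masked autoencoder on this mask)."""
    A, _, _, _ = np.linalg.lstsq(X[:, ~mask], X[:, mask], rcond=None)
    return A

# Build an ensemble of random masks, each hiding the target channel
# plus one extra randomly chosen channel, so the members differ.
target = 0
masks = []
for _ in range(5):
    mask = np.zeros(M, dtype=bool)
    mask[target] = True
    mask[rng.integers(1, M)] = True
    masks.append(mask)

models = [(mask, fit_reconstructor(X, mask)) for mask in masks]

def predict_target(x_row):
    """Average the ensemble members' reconstructions of the target."""
    preds = []
    for mask, A in models:
        recon = x_row[~mask] @ A                # reconstruct masked channels
        pos = list(np.flatnonzero(mask)).index(target)
        preds.append(recon[pos])
    return float(np.mean(preds))

# Toy check: the ensembled prediction should beat the raw variance.
preds = np.array([predict_target(x) for x in X])
mse = float(np.mean((preds - X[:, target]) ** 2))
```

Because every mask hides the target channel, each ensemble member gives an independent reconstruction of it from a different subset of the remaining modalities, and averaging them is the robustness mechanism the title's "ensembles" refers to.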

