Multi-QuAD: Multi-Level Quality-Adaptive Dynamic Network for Reliable Multimodal Classification (2412.14489v3)
Abstract: Multimodal machine learning has achieved remarkable progress in many scenarios, but its reliability is undermined by varying sample quality. This paper finds that existing reliable multimodal classification methods not only fail to estimate data quality robustly, but also lack dynamic networks that adapt depth and parameters per sample for reliable inference. To this end, a novel framework for reliable multimodal classification, termed Multi-level Quality-Adaptive Dynamic multimodal network (Multi-QuAD), is proposed. Multi-QuAD first adopts a novel approach based on noise-free prototypes and a classifier-free design to reliably estimate the quality of each sample at both the modality and feature levels. It then achieves sample-specific network depth via the Global Confidence Normalized Depth (GCND) mechanism. By normalizing depth across modalities and samples, GCND effectively mitigates the impact of challenging modality inputs on dynamic depth reliability. Furthermore, Multi-QuAD provides sample-adaptive network parameters via the Layer-wise Greedy Parameter (LGP) mechanism, driven by feature-level quality. The cross-modality layer-wise greedy strategy in LGP introduces, for the first time, a reliable parameter prediction paradigm for multimodal networks with variable architectures. Experiments on four datasets demonstrate that Multi-QuAD significantly outperforms state-of-the-art methods in both classification performance and reliability, exhibiting strong adaptability to data of diverse quality.
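The abstract names two core ideas: classifier-free quality estimation from prototypes, and confidence-normalized dynamic depth (GCND). The paper's exact formulations are not reproduced here; the following is a minimal PyTorch sketch of the general idea only. The function names, the softmax-over-negative-distance confidence, the global min-max normalization, and the depth range are all illustrative assumptions, not the authors' definitions.

```python
import torch
import torch.nn.functional as F


def prototype_confidence(features, prototypes):
    """Classifier-free confidence from distance to class prototypes.

    features:   (batch, dim) embeddings for one modality.
    prototypes: (classes, dim) class prototypes for that modality.
    Returns a (batch,) score in (0, 1]: closer to the nearest
    prototype -> higher estimated sample quality.
    """
    dists = torch.cdist(features, prototypes)      # (batch, classes)
    # Softmax over negative distances; the max probability serves as
    # a confidence score without any learned classifier head.
    probs = F.softmax(-dists, dim=1)
    return probs.max(dim=1).values


def normalized_depth(confidences, min_depth=2, max_depth=8):
    """Map confidence to a per-sample, per-modality layer count.

    confidences: (batch, modalities) stacked confidence scores.
    Normalizing jointly over the batch and over modalities (the rough
    intent of GCND, as sketched here) keeps one chronically hard
    modality from pushing every sample to the maximum depth.
    """
    c = confidences
    c = (c - c.min()) / (c.max() - c.min() + 1e-8)  # global min-max
    # Low confidence (hard sample) -> more layers; high -> fewer.
    depth = min_depth + (1.0 - c) * (max_depth - min_depth)
    return depth.round().long()


# Usage with random stand-in data: 4 samples, 128-d features, 10 classes.
feats = torch.randn(4, 128)
protos = torch.randn(10, 128)
conf = prototype_confidence(feats, protos)          # (4,)
depths = normalized_depth(conf.unsqueeze(1))        # (4, 1) layer counts
print(depths)
```

Under this reading, the LGP mechanism would then predict layer parameters greedily, layer by layer, conditioned on feature-level quality; that step depends on architectural details the abstract does not specify, so it is not sketched here.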