
CentralNet: a Multilayer Approach for Multimodal Fusion (1808.07275v1)

Published 22 Aug 2018 in cs.AI, cs.CV, and cs.MM

Abstract: This paper proposes a novel multimodal fusion approach that aims to produce the best possible decisions by integrating information coming from multiple media. Most past multimodal approaches work either by projecting the features of different modalities into the same space or by coordinating the representations of each modality through the use of constraints; our approach borrows from both views. More specifically, assuming each modality can be processed by a separate deep convolutional network, so that decisions can be taken independently from each modality, we introduce a central network linking the modality-specific networks. This central network not only provides a common feature embedding but also regularizes the modality-specific networks through the use of multi-task learning. The proposed approach is validated on 4 different computer vision tasks, on which it consistently improves the accuracy of existing multimodal fusion approaches.
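
To make the architecture in the abstract more concrete, here is a minimal forward-pass sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: small dense layers stand in for the convolutional networks, the central branch is assumed to receive a weighted sum of the two modality hidden states (the weights `alphas` are hypothetical parameters), and the multi-task objective is assumed to be a plain sum of cross-entropies over the three branches. All function and parameter names are invented for this example, and training is omitted.

```python
import math
import random


def linear(x, weights, bias):
    """Affine map: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(z):
    return [max(0.0, v) for v in z]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def init_layer(n_in, n_out, rng):
    """Hypothetical small random initialisation, for illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x_a, x_b, params, alphas):
    """Return class probabilities for modality A, modality B, and the central branch."""
    h_a = relu(linear(x_a, *params["a_hidden"]))
    h_b = relu(linear(x_b, *params["b_hidden"]))
    # Assumed fusion rule: the central branch sees a weighted sum of both hidden states.
    fused = [alphas[0] * u + alphas[1] * v for u, v in zip(h_a, h_b)]
    h_c = relu(linear(fused, *params["c_hidden"]))
    return (softmax(linear(h_a, *params["a_out"])),
            softmax(linear(h_b, *params["b_out"])),
            softmax(linear(h_c, *params["c_out"])))


def multitask_loss(probs, label):
    """Sum of cross-entropies, so every branch is trained on the same label."""
    return sum(-math.log(p[label]) for p in probs)


rng = random.Random(0)
hidden, classes = 4, 3
params = {
    "a_hidden": init_layer(5, hidden, rng),   # modality A has 5 input features
    "b_hidden": init_layer(2, hidden, rng),   # modality B has 2 input features
    "c_hidden": init_layer(hidden, hidden, rng),
    "a_out": init_layer(hidden, classes, rng),
    "b_out": init_layer(hidden, classes, rng),
    "c_out": init_layer(hidden, classes, rng),
}
x_a = [rng.random() for _ in range(5)]
x_b = [rng.random() for _ in range(2)]
probs = forward(x_a, x_b, params, alphas=(0.5, 0.5))
loss = multitask_loss(probs, label=1)
```

Because each modality branch keeps its own output head, it can still take a decision on its own, while the shared loss term is what lets the central branch act as a regularizer on the modality-specific branches.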

Authors (4)
  1. Valentin Vielzeuf (17 papers)
  2. Alexis Lechervy (10 papers)
  3. Stéphane Pateux (17 papers)
  4. Frédéric Jurie (27 papers)
Citations (155)