PV2TEA: Patching Visual Modality to Textual-Established Information Extraction (2306.01016v1)

Published 1 Jun 2023 in cs.CL, cs.AI, cs.CV, cs.LG, and cs.MM

Abstract: Information extraction, e.g., attribute value extraction, has been extensively studied and formulated based only on text. However, many attributes can benefit from image-based extraction, such as color, shape, and pattern. The visual modality has long been underutilized, mainly due to the difficulty of multimodal annotation. In this paper, we aim to patch the visual modality to the textual-established attribute information extractor. The cross-modality integration faces several unique challenges: (C1) images and textual descriptions are loosely paired intra-sample and inter-sample; (C2) images usually contain rich backgrounds that can mislead the prediction; (C3) weakly supervised labels from textual-established extractors are biased for multimodal training. We present PV2TEA, an encoder-decoder architecture equipped with three bias reduction schemes: (S1) Augmented label-smoothed contrast to improve the cross-modality alignment for loosely paired image and text; (S2) Attention-pruning that adaptively distinguishes the visual foreground; (S3) Two-level neighborhood regularization that mitigates the textual label bias via reliability estimation. Empirical results on real-world e-Commerce datasets demonstrate up to an 11.74% absolute (20.97% relative) F1 increase over unimodal baselines.
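As a rough illustration of scheme S1, the snippet below sketches a label-smoothed cross-modal contrastive loss in the spirit of the abstract. This is not the authors' implementation; the function name, the smoothing and temperature values, and the toy embedding sizes are all illustrative assumptions.

```python
# Minimal sketch (not the paper's code) of a label-smoothed image-text
# contrastive objective, as suggested by scheme S1 in the abstract.
import torch
import torch.nn.functional as F


def label_smoothed_contrastive_loss(img_emb, txt_emb, temperature=0.07, smoothing=0.1):
    """CLIP-style contrastive loss with smoothed targets.

    Smoothing spreads a small amount of probability mass onto
    non-matching pairs, which softens the penalty when images and
    texts are only loosely paired (challenge C1).
    """
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)

    # Pairwise cosine similarities scaled by temperature: (B, B).
    logits = img_emb @ txt_emb.t() / temperature

    batch_size = logits.size(0)
    # Soft targets: 1 - smoothing on the diagonal (true pair),
    # remaining mass split evenly over the other candidates.
    targets = torch.full_like(logits, smoothing / (batch_size - 1))
    targets.fill_diagonal_(1.0 - smoothing)

    # Symmetric cross-entropy against the soft targets.
    loss_i2t = -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    loss_t2i = -(targets * F.log_softmax(logits.t(), dim=1)).sum(dim=1).mean()
    return 0.5 * (loss_i2t + loss_t2i)


if __name__ == "__main__":
    # Toy batch of 8 image/text embedding pairs of dimension 256.
    imgs, txts = torch.randn(8, 256), torch.randn(8, 256)
    print(label_smoothed_contrastive_loss(imgs, txts).item())
```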

Authors (7)
  1. Hejie Cui (33 papers)
  2. Rongmei Lin (11 papers)
  3. Nasser Zalmout (8 papers)
  4. Chenwei Zhang (60 papers)
  5. Jingbo Shang (141 papers)
  6. Carl Yang (130 papers)
  7. Xian Li (116 papers)
Citations (2)
