
Identity-Aware Textual-Visual Matching with Latent Co-attention (1708.01988v1)

Published 7 Aug 2017 in cs.CV

Abstract: Textual-visual matching aims at measuring similarities between sentence descriptions and images. Most existing methods tackle this problem without effectively utilizing identity-level annotations. In this paper, we propose an identity-aware two-stage framework for the textual-visual matching problem. Our stage-1 CNN-LSTM network learns to embed cross-modal features with a novel Cross-Modal Cross-Entropy (CMCE) loss. The stage-1 network is able to efficiently screen easy incorrect matchings and also provides an initial training point for the stage-2 training. The stage-2 CNN-LSTM network refines the matching results with a latent co-attention mechanism. The spatial attention relates each word with corresponding image regions, while the latent semantic attention aligns different sentence structures to make the matching results more robust to sentence structure variations. Extensive experiments on three datasets with identity-level annotations show that our framework outperforms state-of-the-art approaches by large margins.
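
The screen-then-refine idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine similarity used for the first stage, the `two_stage_match` function, the `refine_score` callback, and the toy vectors are all assumptions made for illustration. In the paper, both stages are CNN-LSTM networks and the second stage uses latent co-attention.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def two_stage_match(
    text_vec: Vector,
    image_vecs: Sequence[Vector],
    refine_score: Callable[[int], float],  # hypothetical stand-in for a stage-2 model
    top_k: int = 2,
) -> List[int]:
    """Return image indices ranked by a cheap screen followed by a costly re-rank."""
    # Stage 1 (assumed): rank every image by embedding similarity and keep the top_k.
    coarse = sorted(
        range(len(image_vecs)),
        key=lambda i: cosine(text_vec, image_vecs[i]),
        reverse=True,
    )
    shortlist = coarse[:top_k]
    # Stage 2 (assumed): re-rank only the shortlisted candidates with the refined score.
    return sorted(shortlist, key=refine_score, reverse=True)


# Toy data: one text embedding, three image embeddings, and made-up stage-2 scores.
text = [1.0, 0.0]
images = [[0.9, 0.1], [0.0, 1.0], [0.8, 0.3]]
refined = {0: 0.2, 1: 0.9, 2: 0.7}
ranking = two_stage_match(text, images, lambda i: refined[i], top_k=2)
print(ranking)
```

In this toy example, image 1 is screened out in the first stage, so its stage-2 score is never consulted; the expensive scorer runs only on the shortlist.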

Author Guidelines for ICCV Proceedings: An Analytical Overview

The document under review presents a set of author guidelines for preparing and submitting manuscripts to the International Conference on Computer Vision (ICCV) proceedings. The specification exists to keep contributions to ICCV consistent and professional in format.

Summary of Key Elements

The guidelines focus primarily on formatting requirements that authors must follow when drafting their manuscripts. Key points include:

  • Standard Document Format: All manuscripts must conform to a two-column layout with specified margins. The text should be set in 10-point Times typeface, fully justified, with predefined header and footer margins.
  • Paper Length Constraints: The main body of the text is limited to eight pages, excluding references, which may extend onto a ninth page. This keeps the presentation brief and focused, in line with the time-constrained nature of the reviewing process.
  • Blind Review Mandates: To ensure impartiality in peer review, authors must avoid self-identifying references within the text while still citing relevant work for context.
  • Figure and Table Presentation: Visual elements must be suitably formatted to ensure clarity in both digital and printed forms. This includes the specification of font sizes and alignment within the layout.
  • Mathematical Representation: Careful attention is required in numbering equations and employing standard notation to facilitate accurate referencing by readers.

Implications and Speculative Developments

The outlined guidelines serve to streamline the submission and review process, promoting an efficient and equitable evaluation of scientific contributions. This uniformity is particularly beneficial in the increasingly collaborative and interdisciplinary nature of computer vision research, allowing for easier cross-verification and comprehension of methodologies.

Adherence to such rigorous guidelines as those laid out by ICCV not only enhances the readability of each paper but can also improve the overall impact and dissemination of research findings within the broader scientific community. While these guidelines are technical in nature, they indirectly foster the pursuit of higher scientific standards through their demand for conciseness and clarity.

As the field evolves, we might anticipate further refinements in the guidelines to accommodate advancements in automated formatting tools and the integration of dynamic, multimedia elements that enhance the presentation of interactive content. Continued enhancement of submission protocols can also leverage AI for preliminary compliance checks, ensuring submissions meet the requisite specifications prior to human review.

In conclusion, the authors present a meticulous framework for manuscript submissions to the ICCV that promotes consistency and quality in the dissemination of scholarly work. As the field progresses, these guidelines will likely evolve in tandem with technological innovations and practices within scientific publishing.

Authors (5)
  1. Shuang Li (203 papers)
  2. Tong Xiao (119 papers)
  3. Hongsheng Li (340 papers)
  4. Wei Yang (349 papers)
  5. Xiaogang Wang (230 papers)
Citations (217)