
Fashion Focus: Multi-modal Retrieval System for Video Commodity Localization in E-commerce (2102.04727v1)

Published 9 Feb 2021 in cs.CV

Abstract: Live-stream and short-video shopping in e-commerce have grown exponentially. However, sellers must manually match images of the products on sale to the timestamps at which they are exhibited in the untrimmed video, which is a complicated process. To solve this problem, we present a demonstration of a multi-modal retrieval system called "Fashion Focus", which precisely localizes product images in online video as the focuses. Different modalities contribute to commodity localization: visual content, linguistic features, and interaction context are jointly investigated via the presented multi-modal learning. Our system employs two procedures for analysis, video content structuring and multi-modal retrieval, to automatically achieve accurate video-to-shop matching. Fashion Focus presents a unified framework that can orient consumers towards relevant product exhibitions while they watch videos and help sellers effectively deliver products through search and recommendation.
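The abstract names two stages, video content structuring and multi-modal retrieval, but gives no implementation details. The sketch below is only an illustration of the general idea, not the authors' method. It assumes the video has already been split into segments with precomputed visual and text embeddings. The names `cosine`, `score_segment`, and `localize_product`, the dictionary fields, and the weight `w_visual` are all hypothetical, and the interaction-context modality is left out for brevity.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_segment(product, segment, w_visual=0.5):
    """Weighted sum of per-modality similarities (a simple late-fusion assumption)."""
    visual_sim = cosine(product["visual"], segment["visual"])
    text_sim = cosine(product["text"], segment["text"])
    return w_visual * visual_sim + (1.0 - w_visual) * text_sim


def localize_product(product, segments, w_visual=0.5):
    """Return ((start, end), score) for the best-matching video segment."""
    best = max(segments, key=lambda seg: score_segment(product, seg, w_visual))
    return (best["start"], best["end"]), score_segment(product, best, w_visual)


# Toy embeddings, invented purely for illustration.
product = {"visual": [1.0, 0.0], "text": [0.0, 1.0]}
segments = [
    {"start": 0, "end": 10, "visual": [0.0, 1.0], "text": [1.0, 0.0]},
    {"start": 10, "end": 25, "visual": [1.0, 0.1], "text": [0.1, 1.0]},
]
span, score = localize_product(product, segments)
```

In this toy setup the second segment is selected because both its visual and text embeddings point in nearly the same direction as the product's. A real system would use learned embeddings and likely a learned fusion rather than a fixed weight.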

Authors (7)
  1. Yanhao Zhang (33 papers)
  2. Qiang Wang (271 papers)
  3. Pan Pan (24 papers)
  4. Yun Zheng (49 papers)
  5. Cheng Da (7 papers)
  6. Siyang Sun (12 papers)
  7. Yinghui Xu (48 papers)
Citations (8)
