Identifying First-person Camera Wearers in Third-person Videos (1704.06340v1)

Published 20 Apr 2017 in cs.CV

Abstract: We consider scenarios in which we wish to perform joint scene understanding, object tracking, activity recognition, and other tasks in environments in which multiple people are wearing body-worn cameras while a third-person static camera also captures the scene. To do this, we need to establish person-level correspondences across first- and third-person videos, which is challenging because the camera wearer is not visible from his/her own egocentric video, preventing the use of direct feature matching. In this paper, we propose a new semi-Siamese Convolutional Neural Network architecture to address this novel challenge. We formulate the problem as learning a joint embedding space for first- and third-person videos that considers both spatial- and motion-domain cues. A new triplet loss function is designed to minimize the distance between correct first- and third-person matches while maximizing the distance between incorrect ones. This end-to-end approach performs significantly better than several baselines, in part by learning the first- and third-person features optimized for matching jointly with the distance measure itself.
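The matching objective described in the abstract can be illustrated with a generic margin-based triplet loss. This is a minimal sketch, not the paper's actual loss or network: it assumes the embeddings have already been computed, uses plain Euclidean distance, and the names `triplet_loss`, `best_match`, and the `margin` value are hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # assumed value, not taken from the paper
) -> float:
    """Generic hinge-style triplet loss.

    anchor:   embedding of a first-person video
    positive: embedding of the correct third-person person candidate
    negative: embedding of an incorrect third-person person candidate
    The loss is zero once the incorrect match is at least `margin`
    farther from the anchor than the correct match.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def best_match(anchor: Sequence[float], candidates: List[Sequence[float]]) -> int:
    """Index of the candidate embedding closest to the anchor."""
    distances = [euclidean(anchor, c) for c in candidates]
    return distances.index(min(distances))


if __name__ == "__main__":
    ego = [0.0, 0.0]
    correct = [0.0, 0.0]
    wrong = [3.0, 4.0]
    print(triplet_loss(ego, correct, wrong))  # well-separated triplet
    print(triplet_loss(ego, wrong, correct))  # violated triplet
    print(best_match(ego, [wrong, correct]))
```

At test time, under these assumptions, identifying the camera wearer reduces to a nearest-neighbor lookup in the shared embedding space, which is what `best_match` does.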

Authors (7)
  1. Chenyou Fan (27 papers)
  2. Jangwon Lee (12 papers)
  3. Mingze Xu (28 papers)
  4. Krishna Kumar Singh (46 papers)
  5. Yong Jae Lee (88 papers)
  6. David J. Crandall (19 papers)
  7. Michael S. Ryoo (75 papers)
Citations (61)
