
The Toybox Dataset of Egocentric Visual Object Transformations (1806.06034v3)

Published 15 Jun 2018 in cs.CV

Abstract: In object recognition research, many commonly used datasets (e.g., ImageNet and similar) contain relatively sparse distributions of object instances and views; for example, one might see a thousand different pictures of a thousand different giraffes, mostly taken from a few conventional photographic angles. These distributional properties constrain the types of computational experiments that can be conducted with such datasets, and they do not reflect naturalistic patterns of embodied visual experience. As a contribution to the small (but growing) number of multi-view object datasets created to bridge this gap, we introduce a new video dataset called Toybox that contains egocentric (i.e., first-person perspective) videos of common household objects and toys being manually manipulated to undergo structured transformations, such as rotation, translation, and zooming. To illustrate potential uses of Toybox, we also present initial neural network experiments that examine 1) how training on different distributions of object instances and views affects recognition performance, and 2) how viewpoint-dependent object concepts are represented within the hidden layers of a trained network.

Authors (7)
  1. Xiaohan Wang (91 papers)
  2. Tengyu Ma (117 papers)
  3. James Ainooson (8 papers)
  4. Seunghwan Cha (2 papers)
  5. Xiaotian Wang (38 papers)
  6. Azhar Molla (1 paper)
  7. Maithilee Kunda (15 papers)
Citations (10)
