Image Representations and New Domains in Neural Image Captioning (1508.02091v1)
Published 9 Aug 2015 in cs.CL and cs.CV
Abstract: We examine the possibility that recent promising results in automatic caption generation are due primarily to language models. By varying the quality of image representations produced by a convolutional neural network, we find that a state-of-the-art neural captioning algorithm is able to produce quality captions even when provided with surprisingly poor image representations. We replicate this result in a new, fine-grained, transfer-learned captioning domain consisting of 66K recipe image/title pairs. We also provide some experiments regarding the appropriateness of datasets for automatic captioning, and find that having multiple captions per image is beneficial, but not an absolute requirement.
- Jack Hessel
- Nicolas Savva
- Michael J. Wilber
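The abstract's core manipulation, degrading the quality of CNN image representations before they reach the captioning model, can be sketched as follows. This is a hypothetical illustration, not the paper's exact procedure: the function name `degrade_features` and the specific degradations (additive Gaussian noise, zeroing a fraction of feature dimensions) are assumptions standing in for whatever controlled corruption the authors applied.

```python
import numpy as np

def degrade_features(features, noise_std=0.5, keep_frac=1.0, seed=0):
    """Simulate lower-quality CNN image representations.

    Adds Gaussian noise to the feature vector and optionally zeroes out
    a random fraction of its dimensions. Hypothetical sketch only; the
    paper's actual degradation procedure may differ.
    """
    rng = np.random.default_rng(seed)
    degraded = features + rng.normal(0.0, noise_std, size=features.shape)
    if keep_frac < 1.0:
        # Keep roughly keep_frac of the feature dimensions, zero the rest.
        mask = rng.random(features.shape[-1]) < keep_frac
        degraded = degraded * mask
    return degraded

# Example: a 4096-dim vector, the size of a typical fc7-style CNN feature.
feats = np.ones((1, 4096), dtype=np.float32)
weak = degrade_features(feats, noise_std=1.0, keep_frac=0.5)
print(weak.shape)
```

A captioning decoder conditioned on `weak` instead of `feats` would then reveal how much caption quality depends on the visual input versus the language model, which is the question the abstract poses.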