- The paper demonstrates that mid-level CNN features, initially trained for object recognition, significantly enhance image style classification.
- It introduces two extensive datasets, Flickr Style (80K photos) and Wikipaintings (85K artworks), for evaluating style recognition performance.
- The findings suggest that integrating style recognition into conventional computer vision pipelines can improve image retrieval and tagging applications.
Recognizing Image Style: An Expert Overview
The paper "Recognizing Image Style" by Karayev et al. addresses the foundational issue of image style recognition within computer vision, a field traditionally focused on object and scene recognition. Despite the significance of style in conveying meaning, it has been underexplored. The authors contribute two extensive datasets, "Flickr Style" and "Wikipaintings," and demonstrate superior classification performance on these using deep learning techniques.
Methodology and Findings
The authors' approach leverages features learned by deep convolutional neural networks (CNNs) initially trained for object classification, applying them to style recognition. The paper reports significant improvements over traditional hand-crafted features such as color histograms and GIST descriptors. This finding suggests that mid-level CNN features generalize well to style classification, even though the networks were trained on a non-style task.
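The overall pipeline is: take fixed mid-level activations from a pretrained CNN, then train simple per-style classifiers on top. The following is a minimal sketch of that second stage, using synthetic random features in place of real DeCAF activations and a bare-bones one-vs-rest logistic regression rather than the paper's exact training setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for mid-level CNN activations (e.g. fc6 DeCAF features);
# here synthetic data whose labels are linear in the features.
n, d, n_styles = 200, 64, 4
true_w = rng.normal(size=(d, n_styles))
X = rng.normal(size=(n, d))
y = (X @ true_w).argmax(axis=1)  # one synthetic style label per image

def train_one_vs_rest(X, y, n_classes, lr=0.1, epochs=500):
    """Binary logistic regression per style class, on fixed features."""
    W = np.zeros((X.shape[1], n_classes))
    for c in range(n_classes):
        t = (y == c).astype(float)   # 1 if image has style c, else 0
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            p = 1.0 / (1.0 + np.exp(-(X @ w)))   # sigmoid scores
            w -= lr * X.T @ (p - t) / len(t)     # gradient step
        W[:, c] = w
    return W

W = train_one_vs_rest(X, y, n_styles)
pred = (X @ W).argmax(axis=1)   # predicted style = highest-scoring class
acc = (pred == y).mean()
```

The key point, mirroring the paper's finding, is that the feature extractor is frozen: all style-specific learning happens in the lightweight classifiers on top.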
Datasets
- Flickr Style: This dataset includes 80,000 photographs annotated with 20 style labels. The styles span categories such as photographic techniques (e.g., HDR), moods (e.g., Melancholy), and genres (e.g., Noir).
- Wikipaintings: With 85,000 images labeled across 25 historical art styles, this dataset provides a substantial resource for analyzing artistic styles across different periods.
Technical Evaluation
The evaluation demonstrates that CNN features outperform the alternatives across datasets. On the AVA style dataset, for instance, CNN-derived DeCAF features achieve a mean Average Precision (mAP) of 0.579, surpassing the previous benchmark of 0.538. Late fusion of features further improves classification performance, underscoring the versatility of CNN-derived features for style recognition.
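To make the metric concrete, here is a minimal sketch of Average Precision on a toy ranking, together with the score-level averaging that one common form of late fusion performs. The scores below are invented for illustration, not taken from the paper:

```python
import numpy as np

def average_precision(y_true, scores):
    """Mean of precision measured at the rank of each positive,
    with items sorted by descending score."""
    order = np.argsort(-np.asarray(scores))
    y = np.asarray(y_true)[order]
    precisions = np.cumsum(y) / np.arange(1, len(y) + 1)
    return precisions[y == 1].mean()

# Toy per-image scores for one style class (illustrative values only).
y = [1, 0, 1, 0, 0]
s = [0.9, 0.8, 0.7, 0.3, 0.1]
ap = average_precision(y, s)  # positives at ranks 1 and 3: (1/1 + 2/3) / 2

# Late fusion sketch: average the scores of classifiers trained on
# different feature channels before ranking.
s_cnn = np.array([0.9, 0.2, 0.8, 0.1, 0.3])
s_color = np.array([0.7, 0.4, 0.6, 0.2, 0.1])
fused_ap = average_precision(y, (s_cnn + s_color) / 2)
```

Mean AP, as reported in the paper, is simply this per-class AP averaged over all style classes.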
Practical Implications
The research underscores the utility of style classification in augmenting traditional image search systems by incorporating stylistic constraints. This could enhance applications like image retrieval, tagging, and content-based filtering. Additionally, the experimental validation using Amazon Mechanical Turk (MTurk) highlights that machine classifiers can match human-level accuracy in style recognition tasks, further validating their applicability in real-world scenarios.
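One simple way to impose such a stylistic constraint, sketched here with invented scores and a hypothetical blending weight `alpha` (not a mechanism specified in the paper), is to re-rank content-based search results by a style classifier's score:

```python
import numpy as np

# Hypothetical relevance scores from a content-based search engine,
# and per-image scores from a style classifier (made-up values).
content_scores = np.array([0.9, 0.8, 0.7, 0.6])
style_scores = np.array([0.1, 0.9, 0.8, 0.2])

alpha = 0.5  # assumed blending weight; tuning it is application-specific
combined = (1 - alpha) * content_scores + alpha * style_scores
ranking = np.argsort(-combined)  # image indices, best match first
```

Under this blend, images that are merely relevant but off-style drop below images that match both the query and the requested style.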
Future Directions
The promising results of reusing CNN features from object recognition for style classification invite further investigation into more specialized network architectures that could capture stylistic nuances more effectively. Additionally, disentangling the correlation between content and style offers a fertile avenue for future research, in which style predictions could be adjusted dynamically based on image content.
Conclusion
"Recognizing Image Style" asserts the potential of CNN-based methods in tackling the complex problem of style recognition in images. The introduction of comprehensive datasets and the demonstration of effective classification techniques pave the way for advanced research in this domain, with implications spanning aesthetics, art history, and beyond. This paper thus provides a crucial step towards integrating artistic and stylistic appreciation into the computational understanding of images.