Large-scale Classification of Fine-Art Paintings: Learning The Right Metric on The Right Feature (1505.00855v1)

Published 5 May 2015 in cs.CV, cs.IR, cs.LG, and cs.MM

Abstract: In the past few years, the number of fine-art collections that are digitized and publicly available has been growing rapidly. With the availability of such large collections of digitized artworks comes the need to develop multimedia systems to archive and retrieve this pool of data. Measuring the visual similarity between artistic items is an essential step for such multimedia systems, which can benefit more high-level multimedia tasks. In order to model this similarity between paintings, we should extract the appropriate visual features for paintings and find out the best approach to learn the similarity metric based on these features. We investigate a comprehensive list of visual features and metric learning approaches to learn an optimized similarity measure between paintings. We develop a machine that is able to make aesthetic-related semantic-level judgments, such as predicting a painting's style, genre, and artist, as well as providing similarity measures optimized based on the knowledge available in the domain of art historical interpretation. Our experiments show the value of using this similarity measure for the aforementioned prediction tasks.

Summary

The paper demonstrates that applying learned metrics to visual features, particularly high-level semantic ones, improves fine-art classification accuracy over classifying on the raw features.
It employs a dataset of over 81,000 paintings and compares metric learning methods like LMNN, ITML, and NCA.
A notable outcome is a 45.97% style classification accuracy, highlighting the benefits of high-level semantic features and feature fusion.

Overview of "Large-scale Classification of Fine-Art Paintings: Learning The Right Metric on The Right Feature"

The paper, by Babak Saleh and Ahmed Elgammal of Rutgers University, starts from the rapid growth of digitized fine-art collections and the resulting need for multimedia systems that can organize and retrieve them. It systematically explores visual features and metric learning methods to optimize similarity measures between paintings, focusing on semantic-level classification tasks: predicting a painting's style, genre, and artist.

Key Elements of the Study

The authors investigate a wide array of visual features, ranging from low-level features like GIST to high-level semantic features such as Classemes and Picodes. They also experiment with several metric learning approaches, including Neighbourhood Components Analysis (NCA), Large Margin Nearest Neighbor (LMNN), and Information Theoretic Metric Learning (ITML), to project paintings into feature spaces optimized for style, genre, and artist classification.
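To make the idea of a learned metric concrete, the sketch below treats a metric as a linear map L, so that the learned distance between two paintings is the Euclidean distance between their projected feature vectors. It then evaluates an LMNN-style loss (pull same-label neighbours together, push differently labelled points outside a unit margin) for a fixed map. This is only an illustration under stated assumptions: the toy data, the hand-picked matrix, and all function names are hypothetical, nothing is optimized, and it is not the authors' implementation.

```python
def project(L, x):
    """Apply the linear map L (a list of rows) to the vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in L]


def sq_dist(L, x, y):
    """Squared learned distance ||Lx - Ly||^2, i.e. (x-y)^T (L^T L) (x-y)."""
    return sum((a - b) ** 2 for a, b in zip(project(L, x), project(L, y)))


def lmnn_loss(L, X, labels, k=1, mu=0.5):
    """Evaluate an LMNN-style loss for a fixed map L (no optimization here)."""
    dim = len(X[0])
    identity = [[float(i == j) for j in range(dim)] for i in range(dim)]
    pull = push = 0.0
    for i, x_i in enumerate(X):
        same = [j for j in range(len(X)) if j != i and labels[j] == labels[i]]
        # Target neighbours: k nearest same-label points in the original space.
        targets = sorted(same, key=lambda j: sq_dist(identity, x_i, X[j]))[:k]
        impostors = [l for l in range(len(X)) if labels[l] != labels[i]]
        for j in targets:
            d_ij = sq_dist(L, x_i, X[j])
            pull += d_ij  # pull target neighbours closer
            for l in impostors:
                # Hinge: penalize differently labelled points inside the margin.
                push += max(0.0, 1.0 + d_ij - sq_dist(L, x_i, X[l]))
    return (1.0 - mu) * pull + mu * push


# Hypothetical toy data: dimension 0 separates the classes, dimension 1 is noise.
X = [[0.0, 0.0], [0.0, 4.0], [1.0, 0.0], [1.0, 4.0]]
labels = [0, 0, 1, 1]

raw = [[1.0, 0.0], [0.0, 1.0]]       # identity map: plain Euclidean distance
learned = [[2.0, 0.0], [0.0, 0.1]]   # hand-picked stand-in for a learned map

loss_raw = lmnn_loss(raw, X, labels)
loss_learned = lmnn_loss(learned, X, labels)
print(loss_raw, loss_learned)
```

On this toy data the loss drops from 64.0 under the identity map to roughly 0.32 under the stand-in map, because stretching the discriminative dimension and shrinking the noisy one moves same-label points together and pushes other-label points past the margin. A real metric learner would search for such a map rather than have it supplied by hand.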

Structure and Methodology

The research employs a dataset of over 81,000 paintings sourced from "Wikiart," covering 27 styles, 45 genres, and 1,119 artists, which makes it one of the largest datasets in the field. The paper's methodology consists of three main strategies:

Metric Learning: Distinct metrics are learned for mapping paintings into optimized feature spaces suitable for style, genre, and artist classification.
Feature Fusion: Multiple types of features are projected using learned metrics and then concatenated for classification purposes.
Metric Fusion: A single type of feature is projected using various metric learning techniques, and the resulting spaces are combined for classification.
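The difference between the two fusion strategies above can be sketched in a few lines. The example assumes each learned metric is available as a linear projection matrix and that "combining" the resulting spaces means concatenating the projected vectors; the second point is an assumption for metric fusion, since the summary only says the spaces are combined. The matrices, feature values, and function names are hypothetical placeholders, not values from the paper.

```python
def project(L, x):
    """Apply the linear map L (a list of rows) to the vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in L]


def feature_fusion(features, maps):
    """Project each feature type with its own learned map, then concatenate.

    features: dict of feature name -> vector for one painting
    maps:     dict of feature name -> matrix learned for that feature type
    """
    fused = []
    for name in sorted(features):  # fixed order keeps the layout stable
        fused.extend(project(maps[name], features[name]))
    return fused


def metric_fusion(x, maps):
    """Project ONE feature vector with several learned maps, then concatenate.

    maps: list of matrices, e.g. one per metric learning method
    """
    fused = []
    for L in maps:
        fused.extend(project(L, x))
    return fused


# Hypothetical features for a single painting.
painting = {"classemes": [1.0, 2.0, 3.0], "gist": [4.0, 5.0]}

# Hypothetical stand-ins for maps learned per feature type.
per_feature_maps = {
    "classemes": [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
    "gist": [[0.5, 0.5]],
}
fused_features = feature_fusion(painting, per_feature_maps)
print(fused_features)  # [1.0, 5.0, 4.5]

# Hypothetical stand-ins for maps learned by two different methods
# on the same feature type.
per_method_maps = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
    [[0.0, 0.0, 1.0]],
]
fused_metrics = metric_fusion(painting["classemes"], per_method_maps)
print(fused_metrics)  # [1.0, 5.0, 3.0]
```

In both cases the fused vector is what a downstream classifier would consume. Because each map can have fewer rows than columns, the fused representation can end up shorter than the raw features it came from.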

The experiments show that classifiers built on learned metrics outperform baseline classifiers trained on the raw visual features, with classification accuracy improving noticeably once metric learning is applied.

Significant Results

The paper finds that high-level semantic features such as Classemes consistently outperform other features across all classification tasks. The authors also note that Boost Metric and ITML are particularly effective across features, showing consistent improvement over the baseline classification results. LMNN, when applied in conjunction with feature fusion, yields the highest classification accuracy: 45.97% on style classification, which surpasses existing work while using feature vectors of much lower dimensionality than previous approaches.

Implications and Future Work

The implications of this paper are twofold: practical and theoretical. On the practical side, optimized metrics for fine-art classification can enhance art recommendation systems and digital archiving by offering fine-grained differentiation between styles, genres, and artists. Theoretically, the work underlines the importance of selecting appropriate feature and metric combinations, a consideration that extends beyond art to image classification more broadly.

In terms of future developments, the authors suggest extending the applicability of learned metrics to tasks like image retrieval and recommendation systems. Moreover, exploring different annotations, such as the period of creation, as a basis for metric learning is proposed as a subsequent step.

Overall, this research makes a valuable contribution to the computer-based analysis of art, presenting a well-evaluated framework and a demonstrable improvement over previous methodologies in classifying digitized art collections.
