Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
156 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Multi-Label Product Categorization Using Multi-Modal Fusion Models (1907.00420v2)

Published 30 Jun 2019 in cs.LG, cs.CV, and stat.ML

Abstract: In this study, we investigated multi-modal approaches using images, descriptions, and titles to categorize e-commerce products on Amazon. Specifically, we examined late fusion models, where the modalities are fused at the decision level. Products were each assigned multiple labels, and the hierarchy in the labels were flattened and filtered. For our individual baseline models, we modified a CNN architecture to classify the description and title, and then modified Keras' ResNet-50 to classify the images, achieving $F_1$ scores of 77.0%, 82.7%, and 61.0%, respectively. In comparison, our tri-modal late fusion model can classify products more effectively than single modal models can, improving the $F_1$ score to 88.2%. Each modality complemented the shortcomings of the other modalities, demonstrating that increasing the number of modalities can be an effective method for improving the performance of multi-label classification problems.

Citations (14)

Summary

We haven't generated a summary for this paper yet.