
Cosine Similarity of Multimodal Content Vectors for TV Programmes (2009.11129v1)

Published 23 Sep 2020 in cs.MM, cs.CL, cs.IR, and cs.LG

Abstract: Multimodal information originates from a variety of sources: audiovisual files, textual descriptions, and metadata. We show how one can represent the content encoded by each individual source using vectors, how to combine the vectors via middle and late fusion techniques, and how to compute the semantic similarities between the contents. Our vectorial representations are built from spectral features and Bags of Audio Words for audio, LSI topics and Doc2vec embeddings for subtitles, and categorical features for metadata. We implement our model on a dataset of BBC TV programmes and evaluate the fused representations to provide recommendations. The late fused similarity matrices significantly improve the precision and diversity of recommendations.
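
The fusion-and-similarity pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the fusion formulas, so the following are assumptions made for illustration only: middle fusion is taken to mean concatenating the per-modality vectors before computing cosine similarity, and late fusion is taken to mean a weighted average of the per-modality cosine similarity matrices. The function names, the weights, and the toy feature values are hypothetical and are not taken from the paper or its BBC dataset.

```python
import numpy as np

def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X (one row per programme)."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0  # avoid division by zero for all-zero rows
    unit = X / norms
    return unit @ unit.T

def middle_fusion(modalities):
    """Assumed middle fusion: concatenate the vectors, then compare once."""
    return cosine_similarity_matrix(np.hstack(modalities))

def late_fusion(modalities, weights=None):
    """Assumed late fusion: weighted average of per-modality similarity matrices."""
    sims = [cosine_similarity_matrix(X) for X in modalities]
    if weights is None:
        weights = np.ones(len(sims))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalise so the weights sum to 1
    return sum(w * S for w, S in zip(weights, sims))

def recommend(sim, item, k):
    """Indices of the k programmes most similar to `item`, excluding itself."""
    scores = sim[item].copy()
    scores[item] = -np.inf
    return np.argsort(-scores)[:k].tolist()

# Toy data: three programmes, three modalities (values are made up).
audio = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
subtitles = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]])
metadata = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

late = late_fusion([audio, subtitles, metadata])
middle = middle_fusion([audio, subtitles, metadata])
top_for_first = recommend(late, item=0, k=1)
```

In this toy example, programmes 0 and 1 agree on audio and metadata but not on subtitles, so the late fused similarity between them is the mean of 1, 0, and 1. Unequal `weights` would let one modality dominate the fused matrix; how the paper actually weights or combines modalities is not stated in the abstract.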

Authors (4)
  1. Saba Nazir (1 paper)
  2. Taner Cagali (1 paper)
  3. Chris Newell (4 papers)
  4. Mehrnoosh Sadrzadeh (51 papers)