Adequacy of learned semantic IDs to replace item IDs

Ascertain whether semantic IDs learned via RQ-VAE from item text embeddings sufficiently capture item semantics to effectively replace traditional item IDs in sequential recommendation models.

Background

Generative retrieval indexes items using semantic IDs derived from content embeddings via RQ-VAE, drastically reducing the number of learned tokens relative to per-item embeddings. This efficiency hinges on the semantic IDs being semantically faithful substitutes for item IDs.
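The size of that reduction can be made concrete with a small calculation. The sketch below uses the notation of the quoted passage under References (N items, tuples of length m, t codes per position with t^m ≈ N); the helper name `min_codebook_size` and the example numbers are illustrative assumptions, not values taken from the paper.

```python
def min_codebook_size(num_items: int, tuple_length: int) -> int:
    """Return the smallest t such that t ** tuple_length >= num_items."""
    # Start from the floating-point root, then correct for rounding error.
    t = max(1, round(num_items ** (1.0 / tuple_length)))
    while t ** tuple_length < num_items:
        t += 1
    while t > 1 and (t - 1) ** tuple_length >= num_items:
        t -= 1
    return t


# Illustrative numbers only (assumed, not from the paper).
N = 1_000_000  # catalogue size
m = 3          # length of each semantic ID tuple
t = min_codebook_size(N, m)

print(f"item-ID embeddings to learn: {N}")
print(f"codes per position (t): {t}")
# Assumption: each tuple position has its own codebook, giving m * t tokens.
print(f"semantic-ID tokens with per-position codebooks: {m * t}")
```

With these assumed numbers, a few hundred tokens stand in for a million per-item embeddings, which is why the question of whether the tokens retain enough item meaning matters.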

The authors are uncertain whether these learned semantic IDs capture enough of an item's meaning to replace item IDs. They probe the question by adapting dense retrieval to use semantic IDs, but they explicitly state that the adequacy of semantic IDs as replacements remains unclear.
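To see why adequacy is a real question, it helps to look at the quantization step that turns an embedding into a semantic ID. The following is a minimal sketch of residual quantization only, assuming fixed random codebooks; a real RQ-VAE learns its codebooks jointly with an encoder and decoder, so this is not the authors' implementation. The function name `residual_quantize` and all dimensions are hypothetical.

```python
import numpy as np


def residual_quantize(x, codebooks):
    """Map one embedding to a tuple of code indices, one per codebook level."""
    residual = np.asarray(x, dtype=float).copy()
    codes = []
    for codebook in codebooks:
        # Pick the code vector closest to what is still unexplained.
        distances = np.linalg.norm(codebook - residual, axis=1)
        index = int(np.argmin(distances))
        codes.append(index)
        # Pass the remaining error on to the next level.
        residual = residual - codebook[index]
    return tuple(codes), residual


# Assumed toy setup: random codebooks stand in for learned ones.
rng = np.random.default_rng(0)
dim, t, m = 8, 16, 3
codebooks = [rng.normal(scale=1.0 / (level + 1), size=(t, dim)) for level in range(m)]

item_embedding = rng.normal(size=dim)  # stands in for an item text embedding
semantic_id, leftover = residual_quantize(item_embedding, codebooks)
reconstruction = sum(cb[i] for cb, i in zip(codebooks, semantic_id))

print("semantic ID:", semantic_id)
print("norm of information not captured:", np.linalg.norm(leftover))
```

The leftover residual is the part of the embedding that the tuple does not encode. Quantization is lossy by construction, so whether the retained part is sufficient for recommendation is an empirical matter, which is the uncertainty described above.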

References

There are notable differences between the two implemented methods: (1) Different Item Indexing and Number of Embeddings: As discussed in Section~\ref{sec:gen_ret_method}, representing $N$ items requires the dense retrieval method to learn and store $\mathcal{O}(N)$ embeddings. In contrast, the semantic-ID-based generative retrieval method only requires $\mathcal{O}(t)$ tokens, where $t^m \approx N$, and $m$ is the length of the semantic ID tuple. However, it remains unclear whether the learned semantic ID can sufficiently capture the item’s semantic meaning and effectively replace the item ID; (2) Text Representation Input: Dense retrieval utilizes the item's text representation as additional input; and (3) Prediction Mechanism: Dense retrieval relies on maximum inner product search in the embedding space, whereas generative retrieval predicts the next item through next-token prediction via beam search.

— Unifying Generative and Dense Retrieval for Sequential Recommendation (2411.18814 - Yang et al., 27 Nov 2024) in Section 2.3 (The Observed Performance Difference)
