Adequacy of learned semantic IDs to replace item IDs
Ascertain whether semantic IDs learned via RQ-VAE from item text embeddings sufficiently capture item semantics to effectively replace traditional item IDs in sequential recommendation models.
References
There are notable differences between the two implemented methods: (1) Different Item Indexing and Number of Embeddings: As discussed in Section~\ref{sec:gen_ret_method}, representing $N$ items requires the dense retrieval method to learn and store $\mathcal{O}(N)$ embeddings. In contrast, the semantic-ID-based generative retrieval method only requires $\mathcal{O}(t)$ tokens, where $t^m \approx N$, and $m$ is the length of the semantic ID tuple. However, it remains unclear whether the learned semantic ID can sufficiently capture the item's semantic meaning and effectively replace the item ID; (2) Text Representation Input: Dense retrieval utilizes the item's text representation as additional input; and (3) Prediction Mechanism: Dense retrieval relies on maximum inner product search in the embedding space, whereas generative retrieval predicts the next item through next-token prediction via beam search.
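The embedding-count contrast in point (1) can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from the referenced paper: it assumes each of the $m$ positions in the semantic ID tuple has its own codebook of size $t$, so that $t^m$ distinct tuples are available and $m \cdot t$ token embeddings are stored (which is $\mathcal{O}(t)$ for fixed $m$). The function names `min_codebook_size` and `embedding_counts` are hypothetical.

```python
def min_codebook_size(num_items: int, id_length: int) -> int:
    """Smallest t such that t**id_length >= num_items.

    Assumption: every item needs a distinct semantic ID tuple of
    length id_length, with t possible tokens per position.
    """
    # Start from the floating-point root, then correct rounding errors
    # in either direction using exact integer arithmetic.
    t = max(1, round(num_items ** (1.0 / id_length)))
    while t ** id_length < num_items:
        t += 1
    while t > 1 and (t - 1) ** id_length >= num_items:
        t -= 1
    return t


def embedding_counts(num_items: int, id_length: int) -> dict:
    """Compare stored embeddings for item IDs vs. semantic ID tokens.

    Assumption: one codebook per tuple position, so the token
    vocabulary holds id_length * t embeddings in total.
    """
    t = min_codebook_size(num_items, id_length)
    return {
        "item_id_embeddings": num_items,               # O(N)
        "codebook_size_t": t,                          # t with t**m >= N
        "semantic_id_token_embeddings": id_length * t, # O(t) for fixed m
    }


if __name__ == "__main__":
    print(embedding_counts(num_items=1_000_000, id_length=3))
```

Under these assumptions, $N = 10^6$ items with $m = 3$ give $t = 100$, that is, 300 token embeddings instead of $10^6$ item embeddings. The calculation only shows the storage saving; it does not address the open question of whether such compact tuples retain enough of each item's semantics.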