CERES: Pretraining of Graph-Conditioned Transformer for Semi-Structured Session Data (2204.04303v1)

Published 8 Apr 2022 in cs.IR

Abstract: User sessions empower many search and recommendation tasks on a daily basis. Such session data are semi-structured: they encode heterogeneous relations between queries and products, and each item is described by unstructured text. Despite recent advances in self-supervised learning for text or graphs, there is a lack of self-supervised learning models that can effectively capture both intra-item semantics and inter-item interactions for semi-structured sessions. To fill this gap, we propose CERES, a graph-based transformer model for semi-structured session data. CERES learns representations that capture both inter- and intra-item semantics with (1) a graph-conditioned masked language pretraining task that jointly learns from item text and item-item relations; and (2) a graph-conditioned transformer architecture that propagates inter-item contexts to item-level representations. We pretrain CERES on ~468 million Amazon sessions and find that CERES outperforms strong pretraining baselines by up to 9% on three session search and entity linking tasks.
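To make the idea of "graph-conditioned" pretraining concrete, the following is a minimal illustrative sketch, not the paper's actual implementation: a transformer encodes each item's text (intra-item semantics), a single message-passing step over a hypothetical session adjacency matrix propagates inter-item context, and the resulting context conditions the token states before a masked-language-model head. All module names, dimensions, and the pooling/propagation choices below are assumptions made for illustration.

```python
# Hypothetical sketch of a graph-conditioned masked-LM encoder.
# Not the CERES architecture; shapes and propagation rule are illustrative.
import torch
import torch.nn as nn

class GraphConditionedEncoder(nn.Module):
    def __init__(self, vocab_size=30522, d_model=256, n_heads=4, n_layers=2):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.text_encoder = nn.TransformerEncoder(layer, n_layers)  # intra-item semantics
        self.graph_proj = nn.Linear(d_model, d_model)                # inter-item propagation
        self.mlm_head = nn.Linear(d_model, vocab_size)               # masked-token prediction

    def forward(self, token_ids, adjacency):
        # token_ids: (num_items, seq_len); adjacency: (num_items, num_items)
        tok = self.text_encoder(self.tok_emb(token_ids))             # (I, L, D)
        item_repr = tok.mean(dim=1)                                  # pool to item level: (I, D)
        # One normalized message-passing step over the session graph.
        deg = adjacency.sum(dim=-1, keepdim=True).clamp(min=1)
        graph_ctx = self.graph_proj(adjacency @ item_repr / deg)     # (I, D)
        # Condition each item's token states on its graph context before MLM prediction.
        conditioned = tok + graph_ctx.unsqueeze(1)
        return self.mlm_head(conditioned)                            # (I, L, V) logits

# Usage: predict masked tokens for 3 items in one session.
model = GraphConditionedEncoder()
tokens = torch.randint(0, 30522, (3, 16))
adj = torch.tensor([[0., 1., 1.], [1., 0., 0.], [1., 0., 0.]])
logits = model(tokens, adj)  # train with cross-entropy on masked positions only
```

In this sketch the graph conditioning happens once, after pooling; the paper describes propagating inter-item contexts into item-level representations, which could equally be realized with repeated or attention-based propagation.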

Authors (6)
  1. Rui Feng (67 papers)
  2. Chen Luo (77 papers)
  3. Qingyu Yin (44 papers)
  4. Bing Yin (56 papers)
  5. Tuo Zhao (131 papers)
  6. Chao Zhang (907 papers)
Citations (2)