CROSS-JEM: Joint Modeling for Ranking & UHE Neutrinos

Updated 8 March 2026
  • CROSS-JEM names two distinct concepts; in neural ranking, it jointly encodes candidate short texts in one pass, significantly reducing inference latency.
  • It employs a novel listwise Ranking Probability Loss and token union strategy to achieve state-of-the-art performance on public and proprietary benchmarks.
  • In astroparticle physics, CROSS-JEM leverages quasi-horizontal air shower data to constrain models of warped extra-dimensional gravity via UHE cosmic neutrinos.

CROSS-JEM refers to two technically distinct concepts within the research literature: (1) Cross-Encoders with Joint Efficient Modeling in the context of neural short-text ranking for search and recommendation (Paliwal et al., 2024), and (2) the use of cross-section measurements at JEM-EUSO to probe warped extra-dimensional gravity models via ultra-high-energy cosmic neutrinos (Mladenov et al., 2015). While unrelated in immediate subject matter, both leverage “cross-” or “joint” modeling to advance state-of-the-art performance or sensitivity within their domains.

1. Cross-Encoders with Joint Efficient Modeling (CROSS-JEM) for Short-Text Ranking

CROSS-JEM is a Transformer-based neural ranking architecture designed to efficiently and accurately rank sets of short text items (such as ad keywords, web page titles, tag phrases) based on query relevance. Unlike standard cross-encoders, which process each query-item pair independently and thus incur high computational cost and ignore listwise interactions, CROSS-JEM jointly encodes an entire candidate set in a single Transformer pass by exploiting token redundancies across items. This joint modeling significantly reduces inference latency and enables direct listwise optimization (Paliwal et al., 2024).

2. Design and Methodology of CROSS-JEM

2.1. Token Union and Joint Encoding

For each query $q$ and associated short-text item set $K_q = \{k_1, \ldots, k_N\}$:

  • Tokenize $q$ to $T_q$.
  • Tokenize each item $k_j$ to $T_{k_j}$, then construct the set-union:

$$T_U = \bigcup_{j=1}^N T_{k_j}, \quad |T_U| \ll \sum_{j=1}^N |T_{k_j}|$$

  • Form a single concatenated input sequence: $[\text{CLS}]\,T_q\,[\text{SEP}]\,T_U\,[\text{SEP}]$
  • Encode jointly:

$$E = \mathrm{Encoder}([\mathrm{CLS}, T_q, \mathrm{SEP}, T_U, \mathrm{SEP}]) \in \mathbb{R}^{L \times d}$$

where $L = 1 + |T_q| + 1 + |T_U| + 1$.
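
The construction above can be illustrated with a minimal sketch. It assumes a toy whitespace tokenizer in place of a real subword tokenizer; the function names `tokenize` and `build_joint_input` and the example strings are hypothetical and not taken from the paper.

```python
def tokenize(text):
    """Toy whitespace tokenizer (assumption; a real system would use subword tokens)."""
    return text.lower().split()


def build_joint_input(query, items):
    """Return the joint input sequence and the ordered token union T_U."""
    query_tokens = tokenize(query)
    union_tokens = []
    seen = set()
    for item in items:
        for tok in tokenize(item):
            if tok not in seen:  # keep each distinct item token only once
                seen.add(tok)
                union_tokens.append(tok)
    sequence = ["[CLS]"] + query_tokens + ["[SEP]"] + union_tokens + ["[SEP]"]
    return sequence, union_tokens


query = "cheap running shoes"
items = ["running shoes sale", "trail running shoes", "shoes for men"]
sequence, union_tokens = build_joint_input(query, items)
total_item_tokens = sum(len(tokenize(k)) for k in items)
print(sequence)
print(len(union_tokens), "union tokens vs", total_item_tokens, "item tokens")
```

The union is shorter than the concatenation of all item token lists whenever items share tokens, which is the redundancy the joint encoding relies on.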

2.2. Selective Pooling and Joint Scoring

For each item $k_j$, the model identifies token positions corresponding to $T_q$, $T_{k_j}$ (as realized in $T_U$), and the separating token (SEP), and pools their encoded vectors:

$$P_j = \{ \text{positions for } T_q \} \cup \{ \text{positions in } T_U \cap T_{k_j} \} \cup \{\text{SEP}\}$$

Pooled vector for item $k_j$:

$$v_j = \frac{1}{|P_j|} \sum_{t \in P_j} E_t$$

All $N$ item logits are then computed in a single matrix multiplication:

$$s_j = \langle w, v_j \rangle, \quad j = 1, \ldots, N$$

where $w$ is a shared linear projection.
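
A minimal sketch of the pooling and scoring step follows. It assumes that the pooled separator is the first [SEP], and it uses made-up two-dimensional vectors in place of real encoder outputs; `pool_positions` and `score_item` are hypothetical names introduced only for illustration.

```python
def pool_positions(sequence, query_len, item_tokens):
    """Positions P_j: query tokens, the first [SEP] (assumed), and union tokens found in the item."""
    sep_pos = 1 + query_len                    # [CLS] sits at 0, the query at 1..query_len
    positions = list(range(1, 1 + query_len))  # query token positions
    positions.append(sep_pos)
    item_set = set(item_tokens)
    # Scan the union segment, excluding the final [SEP].
    for pos in range(sep_pos + 1, len(sequence) - 1):
        if sequence[pos] in item_set:
            positions.append(pos)
    return positions


def score_item(encoded, positions, w):
    """Mean-pool the selected rows of E, then take the inner product with w."""
    dim = len(w)
    pooled = [sum(encoded[p][d] for p in positions) / len(positions) for d in range(dim)]
    return sum(wd * vd for wd, vd in zip(w, pooled))


sequence = ["[CLS]", "a", "b", "[SEP]", "x", "y", "z", "[SEP]"]
encoded = [[float(p), 1.0] for p in range(len(sequence))]  # stand-in for encoder output E
positions = pool_positions(sequence, query_len=2, item_tokens=["x", "z"])
score = score_item(encoded, positions, w=[1.0, 0.0])
print(positions, score)
```

In a batched implementation the per-item means can be written as a single mask-matrix product with $E$, which is what allows all $N$ logits to come out of one matrix multiplication.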

2.3. Training Objective: Ranking Probability Loss

CROSS-JEM introduces the listwise Ranking Probability Loss (RPL) to optimize over the correct global ranking order. Let $y_{i,j}$ denote the ground-truth relevance for query $i$, item $j$, and $f_{i,j}$ their predicted logit. For position $j$, define:

$$L_j(i) = \{ k \mid y_{i,k} < y_{i,j} \}$$

Then,

$$L_{\mathrm{RPL}}(\theta, w) = -\sum_{i=1}^{|Q_{\mathrm{tr}}|} \sum_{j=1}^N \sum_{k \in L_j(i)} y_{i,k} \log \frac{\exp(f_{i,k})}{\sum_{\ell \in L_j(i)} \exp(f_{i,\ell})}$$

This loss is mathematically equivalent to minimizing the KL-divergence to the ground-truth top-1 distribution as in the ListNet top-1 objective.
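
As a rough illustration, the sketch below transcribes the RPL formula as written above for a single query, using plain Python lists. The function name and the example labels are hypothetical, and this is not the authors' implementation. Taken literally, the formula gives zero weight to items with label 0, so the example uses graded labels.

```python
import math


def ranking_probability_loss(labels, logits):
    """RPL for one query, following the formula term by term."""
    loss = 0.0
    for y_j in labels:
        # L_j: indices of items strictly less relevant than item j.
        lower = [k for k, y_k in enumerate(labels) if y_k < y_j]
        if not lower:
            continue
        # Log of the softmax normalizer over L_j, computed stably.
        m = max(logits[k] for k in lower)
        log_norm = m + math.log(sum(math.exp(logits[k] - m) for k in lower))
        for k in lower:
            loss -= labels[k] * (logits[k] - log_norm)
    return loss


graded_loss = ranking_probability_loss(labels=[2, 1, 0], logits=[0.0, 0.0, 0.0])
print(graded_loss)
```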

3. Computational Efficiency via Token Redundancy

Empirical studies reveal that candidate sets of $N = 100$–$700$ short-text items typically exhibit a 5–10$\times$ overlap in subword tokens, so $|T_U| \approx \dfrac{1}{C}\sum_j |T_{k_j}|$ with $C \sim 5$–$10$. As a result, the computational complexity drops from $\mathcal{O}(N(L_q + L_k)^2 L_\text{layers})$ for standard cross-encoders to $\mathcal{O}((L_q + |T_U|)^2 L_\text{layers})$ for CROSS-JEM, where $L_q$ and $L_k$ are the query and per-item token lengths and $L_\text{layers}$ is the number of Transformer layers. This results in approximately 4$\times$ lower latency in practice when $N = 100$, $C \approx 11$.
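
To make the complexity comparison concrete, the sketch below evaluates the ratio of the two quadratic terms for given lengths. The query length of 8 tokens and item length of 4 tokens are hypothetical values chosen only for illustration. The result is a crude estimate of the attention cost alone and should not be read as a latency prediction.

```python
def attention_cost_ratio(n_items, query_len, item_len, overlap):
    """Ratio of N * (L_q + L_k)^2 to (L_q + |T_U|)^2, with |T_U| ~ N * L_k / C."""
    union_len = n_items * item_len / overlap               # approximate |T_U|
    per_pair_cost = n_items * (query_len + item_len) ** 2  # standard cross-encoder
    joint_cost = (query_len + union_len) ** 2              # single joint pass
    return per_pair_cost / joint_cost


ratio = attention_cost_ratio(n_items=100, query_len=8, item_len=4, overlap=11)
print(f"estimated quadratic-cost ratio: {ratio:.1f}x")
```

Because the joint sequence grows with $|T_U|$, the benefit depends on the overlap factor $C$: with no overlap (`overlap=1`) the joint pass would be costlier than scoring pairs independently under this estimate.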

| Model | Inference Latency (700 items, A100) | Throughput (pairs/sec) |
|---|---|---|
| monoBERT | 41.3 ms | 3.35K |
| CROSS-JEM | 9.8 ms | 17.2K |
| CPU Sparse Models | ~300 ms | |

4. Empirical Performance and Evaluation

CROSS-JEM demonstrates state-of-the-art accuracy and substantial efficiency gains on both public and proprietary datasets.

  • Public Benchmarks (N=10):
    • SODQ: MAP@5 = 52.40% (CROSS-JEM) vs. 48.31% (ANCE), 46.79% (monoBERT)
    • MS MARCO-Titles: MRR@10 = 35.45% (CROSS-JEM), 32.47% (monoBERT), 30.55% (INSTRUCTOR)
  • Sponsored Search (N=700, proprietary):
    • MAP@100 = 97.48% (CROSS-JEM) vs. 84.38% (MEB), 78.39% (ANCE)
    • Negative Accuracy (retaining 80% positives): 99.45%
    • Live A/B: quick-back-rate reduced by 1.8%, judged relevance improved by 10.2%
  • Ablations show RPL achieves superior MRR@10 (35.45%) compared to BCE (31.46%), CE-listwise (32.03%), or vanilla ListNet (30.27%).
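
For reference, the metrics quoted above can be computed as in the following sketch. It assumes binary relevance (label > 0 counts as relevant) and normalizes average precision by min(k, number of relevant items). MAP@k conventions vary, and the source does not state which one was used, so that choice is an assumption.

```python
def reciprocal_rank_at_k(ranked_labels, k):
    """1 / rank of the first relevant item within the top k, else 0."""
    for rank, rel in enumerate(ranked_labels[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def average_precision_at_k(ranked_labels, k):
    """Mean of precision@rank over relevant hits in the top k (assumed normalization)."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(ranked_labels[:k], start=1):
        if rel > 0:
            hits += 1
            precision_sum += hits / rank
    total_relevant = sum(1 for rel in ranked_labels if rel > 0)
    denom = min(k, total_relevant)
    return precision_sum / denom if denom else 0.0


def mean_over_queries(metric, ranked_lists, k):
    """Average a per-query metric over all queries."""
    return sum(metric(r, k) for r in ranked_lists) / len(ranked_lists)


# Each inner list holds relevance labels in the order the model ranked the items.
ranked_lists = [[0, 1, 0, 1, 0], [1, 0, 0, 0, 0]]
mrr_at_5 = mean_over_queries(reciprocal_rank_at_k, ranked_lists, 5)
map_at_5 = mean_over_queries(average_precision_at_k, ranked_lists, 5)
print(mrr_at_5, map_at_5)
```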

5. Implications for Production Systems

CROSS-JEM’s joint encoding and listwise training admit several properties critical for high-throughput, real-time ranking scenarios:

  • One-pass encoding for all candidates eliminates the multiplicative cost in $N$ present in standard architectures.
  • Retains full parameter efficiency and item list calibration, absent from dual-encoder or late-interaction methods.
  • Achieves sub-10 ms latency for hundreds of candidates on commodity accelerators.
  • Compatible with small BERT-base backbones, obviating the need for LLM-scale parameter counts.
  • Results in direct improvements to user engagement and advertiser ROI in commercial applications due to higher accuracy and reduced quick-back rates (Paliwal et al., 2024).

6. CROSS-JEM in Astroparticle Physics: JEM-EUSO and Model-Dependent Cross-Section Enhancement

In a separate context, the term "CROSS-JEM" is used as a narrative label for leveraging JEM-EUSO’s quasi-horizontal air shower data (CRoss-section at JEM-EUSO, Editor's term) to constrain or discover signatures of warped extra-dimension gravity as described by the Randall–Sundrum (RS) model (Mladenov et al., 2015).

  • Theoretical Background:
    • The RS model with small 5D curvature ($\kappa \ll M_5$) predicts an almost continuous spectrum of light reggeized Kaluza-Klein gravitons.
    • Ultra-high-energy neutrino–nucleon ($\nu N$) interactions at $\sqrt{s} \gtrsim M_5$ are dominated by $t$-channel gravi-Reggeon exchanges, resulting in strongly enhanced cross-sections at $E_\nu \gtrsim 10^{19}$ eV.
  • Event Rate Prediction at JEM-EUSO:
    • For $M_5 \lesssim 4$ TeV, $\kappa \sim 1$ GeV, the predicted number of quasi-horizontal air showers exceeds a few events annually, significantly above the SM expectation ($\sim 0.06$ yr$^{-1}$).
    • Example event rate table (one year, $\kappa = 1$ GeV):

| $M_5$ (TeV) | Expected Events (yr$^{-1}$) |
|---|---|
| 3 | 6.7 |
| 4 | 1.2 |
| 5 | 0.31 |

    • A null result ($N_\text{obs} = 0$) would set a lower bound $M_5 \gtrsim 4$ TeV (95% CL), while any significant upward deviation from the SM would signal new trans-Planckian physics.

  • Sensitivity and Systematics:
    • Statistical uncertainty is Poissonian; 5-year exposure improves reach over ground arrays by an order of magnitude.
    • Dominant systematics: flux normalization (factor of $\sim 2$), exposure uncertainty ($\sim 15\%$), model uncertainties in high-$s$ gravity ($\sim 20\%$).
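
The Poissonian counting argument can be sketched numerically. The example below assumes background-free counting, neglecting the small SM expectation and all systematics, and takes the yearly rates listed above at face value. It is an illustration of how a null result translates into an exclusion, not a reproduction of the analysis in Mladenov et al. (2015).

```python
import math

# M_5 in TeV -> expected events per year (kappa = 1 GeV), as quoted above.
EXPECTED_EVENTS_PER_YEAR = {3: 6.7, 4: 1.2, 5: 0.31}


def prob_zero_events(rate_per_year, years):
    """Poisson probability of observing no events for the given exposure."""
    return math.exp(-rate_per_year * years)


def years_to_exclude(rate_per_year, confidence=0.95):
    """Exposure at which a null result has probability 1 - confidence under the model."""
    return -math.log(1.0 - confidence) / rate_per_year


for m5, rate in EXPECTED_EVENTS_PER_YEAR.items():
    print(f"M_5 = {m5} TeV: P(0 events in 1 yr) = {prob_zero_events(rate, 1):.3f}, "
          f"null-result exclusion after {years_to_exclude(rate):.2f} yr")
```

Under these simplifying assumptions, a model is disfavored at 95% CL once its expected count for the accumulated exposure reaches about three events.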

A plausible implication is that the “CROSS-JEM” analysis exemplifies the synergy of collider-inspired BSM theory and space-based cosmic ray observatories for probing fundamental quantum gravity effects at unprecedented energies (Mladenov et al., 2015).

7. Summary and Outlook

CROSS-JEM, in the neural ranking setting, bridges the gap between the high accuracy of cross-encoders and the low latency required for practical deployment in short-text ranking by (1) jointly encoding all items in one Transformer pass using a token union, (2) selectively pooling per-item embeddings, and (3) training with a novel listwise ranking loss. In the context of astroparticle physics, "CROSS-JEM" describes a methodology for extracting strong constraints on extra-dimensional gravity by linking reggeized graviton exchange modifications in $\nu N$ cross-sections to observed air shower rates at JEM-EUSO. Both research lines demonstrate the power of joint, cross-item modeling to surpass prior limitations, either in computational efficiency, system accuracy, or new physics sensitivity (Paliwal et al., 2024, Mladenov et al., 2015).
