
TrajMamba: Efficient Trajectory Learning

Updated 27 October 2025
  • TrajMamba is a dual-branch selective state-space model that fuses continuous GPS dynamics and discrete road data for efficient trajectory analysis.
  • Its travel purpose-aware pre-training leverages textual embeddings and contrastive InfoNCE alignment to integrate semantic insights without increasing inference cost.
  • The method employs learnable mask-based compression and knowledge distillation to eliminate redundant data, enhancing computational efficiency and prediction accuracy.

TrajMamba denotes a class of efficient, semantically rich methods for vehicle trajectory learning centered on the Traj-Mamba architecture—a dual-branch selective state-space model (SSM) incorporating GPS and road perspectives with specialized pre-training schemes for travel purpose integration and data reduction. TrajMamba is designed to extract movement patterns and embed travel semantics from vehicle GPS trajectories while optimizing for computational efficiency and robust generalization, making it well-suited for large-scale intelligent transportation applications (Liu et al., 20 Oct 2025).

1. Traj-Mamba Encoder Architecture

At the core of TrajMamba is the Traj-Mamba encoder, which jointly models both the continuous movement dynamics and the contextual semantics of vehicle trips:

  • Dual-branch SSM design: the encoder accepts two types of input features for each trajectory:
    • GPS perspective: raw spatial and temporal data, supplemented with high-order movement features such as instantaneous speed $v_i$, acceleration $\text{acc}_i$, and turning angle $\theta_i$ computed at every timestamp.
    • Road perspective: discrete identifiers of traversed road segments and encoded cyclic temporal features (e.g., hour of day, day of week).
  • Architecture: multiple Traj-Mamba blocks are stacked, each containing
    • a GPS-SSM branch: input projection, causal convolution, and a selective SSM parameterized by movement features;
    • a Road-SSM branch: analogous processing for road-related features.
  • Input-dependent parameterization: each SSM branch computes matrices $B$, $C$, and gating $\Delta$ via learned projections from the high-order features, e.g.,

$$B = \text{Linear}(S_\mathcal{T}),\quad C = \text{Linear}(S_\mathcal{T}),\quad \Delta = \sigma_{\Delta}(\text{Linear}(S_\mathcal{T}) + b_{\Delta})$$

    where $S_\mathcal{T}$ denotes the high-order feature sequence and $\sigma_{\Delta}$ is Softplus or SiLU.
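A minimal PyTorch sketch of this input-dependent parameterization (module and dimension names are illustrative assumptions, not the authors' released code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelectiveSSMParams(nn.Module):
    """Sketch of B = Linear(S_T), C = Linear(S_T),
    Delta = softplus(Linear(S_T) + b_Delta). All dimension names
    are assumptions for illustration."""

    def __init__(self, feat_dim: int, state_dim: int, inner_dim: int):
        super().__init__()
        self.proj_B = nn.Linear(feat_dim, state_dim)
        self.proj_C = nn.Linear(feat_dim, state_dim)
        self.proj_delta = nn.Linear(feat_dim, inner_dim, bias=False)
        self.b_delta = nn.Parameter(torch.zeros(inner_dim))  # the b_Delta bias term

    def forward(self, s_t: torch.Tensor):
        # s_t: (batch, seq_len, feat_dim) high-order features (speed, acceleration, angle)
        B = self.proj_B(s_t)                                     # (batch, seq_len, state_dim)
        C = self.proj_C(s_t)                                     # (batch, seq_len, state_dim)
        delta = F.softplus(self.proj_delta(s_t) + self.b_delta)  # positive gating step size
        return B, C, delta
```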

  • Feature fusion:

The GPS and road latent embeddings are fused using a dot-product gating mechanism, for example $Z_i^g = \text{Linear}(\text{RMSNorm}(Y_i^g \odot X_i^r))$, where $Y_i^g$ and $X_i^r$ are the outputs from the respective SSM branches.

  • Output trajectory embedding:

The final representation $z_\mathcal{T}$ is obtained by concatenating and mean-pooling the fused outputs. This embedding robustly encodes both movement patterns and spatial context while maintaining linear complexity in trajectory length.
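A hedged sketch of the fusion and pooling step (the RMSNorm helper and layer sizes are assumptions, not the authors' exact implementation):

```python
import torch
import torch.nn as nn

def rms_norm(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Root-mean-square normalization over the feature dimension
    return x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)

class GatedFusion(nn.Module):
    """Dot-product gating of the GPS-branch output with the road-branch
    features, followed by mean pooling into one trajectory embedding."""

    def __init__(self, dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, out_dim)

    def forward(self, y_gps: torch.Tensor, x_road: torch.Tensor) -> torch.Tensor:
        # y_gps, x_road: (batch, seq_len, dim)
        z = self.proj(rms_norm(y_gps * x_road))  # elementwise gate, normalize, project
        return z.mean(dim=1)                     # pool over time -> (batch, out_dim)
```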

2. Travel Purpose-aware Pre-training

To enrich embeddings with travel purpose semantics without affecting inference cost, TrajMamba introduces a two-stage pre-training strategy:

  • Textual pre-training branches:

    • For each trajectory, road segments and surrounding POIs are embedded using a shared pre-trained textual embedding model, applied to raw textual attributes (e.g., names, POI descriptions).
    • Embeddings are contextually enriched through local aggregation and global context using learnable aggregation functions.
    • $\hat{z}_{e_i} = z_{e_i} + \text{Agg}^{\text{Road}}(\{z_{e_j}\}, z_{e_1}, z_{e_n})$ for road segment embeddings;
    • $\hat{z}_{p_i} = z_{p_i} + \text{Agg}^{\text{POI}}(\{z_{p_j}\}, z_{p_1}, z_{p_n}) + E_{\text{Pid}}(p_i)$ for POI embeddings.
  • Semantic view extraction:

Each road and POI view is summarized via dedicated Mamba blocks and mean pooling, yielding compressed views $z^{\text{Road}}$ and $z^{\text{POI}}$ that encode the trip's geographic and functional semantics.

  • Contrastive InfoNCE-based alignment:

The main trajectory embedding $z_\mathcal{T}$ is aligned with both textual views using an InfoNCE loss with learnable temperature $T$, ensuring that $z_\mathcal{T}$ implicitly encodes the trip's underlying purpose:

$$\mathcal{L}_{\text{InfoNCE}} = -\frac{1}{N} \sum_{i=1}^N \log \frac{\exp(\text{sim}(z_\mathcal{T}^i, z_{\text{view}}^i)/T)}{\sum_{j=1}^N \exp(\text{sim}(z_\mathcal{T}^i, z_{\text{view}}^j)/T)}$$
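A minimal sketch of this loss in PyTorch (cosine similarity and a learnable log-temperature are assumptions about the exact parameterization):

```python
import torch
import torch.nn.functional as F

def info_nce(z_traj: torch.Tensor, z_view: torch.Tensor, log_temp: torch.Tensor) -> torch.Tensor:
    """InfoNCE over N paired embeddings: the matched pair is the positive,
    all other in-batch pairs act as negatives.
    z_traj, z_view: (N, d); log_temp: learnable scalar holding log T."""
    z_traj = F.normalize(z_traj, dim=-1)   # unit vectors -> dot product is cosine sim
    z_view = F.normalize(z_view, dim=-1)
    logits = z_traj @ z_view.t() / log_temp.exp()   # (N, N) similarity matrix / T
    labels = torch.arange(z_traj.size(0), device=z_traj.device)
    return F.cross_entropy(logits, labels)  # -mean log softmax of the diagonal entries
```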

Crucially, the text-based branches are only employed during pre-training. At inference, only the highly efficient TrajMamba encoder is used, incurring no extra cost for semantic integration.

3. Knowledge Distillation and Trajectory Compression

TrajMamba incorporates a knowledge distillation scheme to both identify key points in trajectories and compress them for fast, high-quality embedding:

  • Rule-based preprocessing:

Candidate redundant or non-informative trajectory points (e.g., during vehicle idleness or constant-velocity travel) are pruned from the raw input.
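A simple illustration of this kind of rule-based pruning (the thresholds, point layout, and function name are assumptions; the paper's exact rules may differ):

```python
import numpy as np

def prune_redundant_points(points: np.ndarray,
                           idle_speed: float = 0.5,
                           heading_eps: float = 5.0) -> np.ndarray:
    """Drop idle and near-constant-velocity points.
    points: (n, 4) array of [lon, lat, speed_mps, heading_deg] per timestamp."""
    if len(points) < 3:
        return points
    keep = [0]  # always keep the trip origin
    for i in range(1, len(points) - 1):
        prev = points[keep[-1]]
        idle = points[i, 2] < idle_speed and prev[2] < idle_speed
        steady = (abs(points[i, 2] - prev[2]) < idle_speed
                  and abs(points[i, 3] - prev[3]) < heading_eps)
        if not (idle or steady):
            keep.append(i)
    keep.append(len(points) - 1)  # always keep the destination
    return points[keep]
```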

  • Learnable mask generation:

For the remaining points, a soft mask vector $m$ is learned via $m_i = g(\mu_i) = \max(0, \min(1, \mu_i + \epsilon))$, where $\mu$ are learnable parameters and $\epsilon$ is Gaussian noise injected during training to encourage robustness and sparsity.
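A sketch of this mask (the noise scale and per-point parameterization are assumptions):

```python
import torch
import torch.nn as nn

class LearnableMask(nn.Module):
    """Soft point-selection mask m_i = clamp(mu_i + eps, 0, 1); Gaussian
    noise eps is added only in training mode."""

    def __init__(self, num_points: int, noise_std: float = 0.1):
        super().__init__()
        self.mu = nn.Parameter(torch.full((num_points,), 0.5))  # one value per point
        self.noise_std = noise_std

    def forward(self) -> torch.Tensor:
        eps = torch.randn_like(self.mu) * self.noise_std if self.training else 0.0
        return (self.mu + eps).clamp(0.0, 1.0)  # points clamped to 0 are pruned
```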

  • Compressed trajectory embedding:

The masked trajectory is passed through a new TrajMamba encoder, initialized from a teacher encoder pre-trained with the travel-purpose objective.

  • Multi-view entropy coding (MEC) loss with mask penalty:

The distillation loss combines a MEC loss (aligning compressed and teacher embeddings) and a penalty that encourages sparsity in the mask:

$$\mathcal{L} = \frac{1}{2}\left(\mathcal{L}_{\text{MEC}} + \mathcal{L}_{\text{mask}}\right)$$
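The combination itself is a simple equal weighting; a sketch (treating the MEC term as a precomputed tensor and using the mean mask value as the sparsity penalty, both assumptions for illustration):

```python
import torch

def distillation_loss(mec_loss: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Equal-weight sum of the MEC alignment loss and a mask sparsity
    penalty; L_mask = mean(mask) is an assumed stand-in that pushes the
    model to retain fewer points."""
    l_mask = mask.mean()
    return 0.5 * (mec_loss + l_mask)
```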

This method ensures that only the most semantically discriminative trajectory points are retained, resulting in smaller, more informative embeddings (Liu et al., 20 Oct 2025).

4. Experimental Evaluation

Evaluation was conducted on two large-scale real-world taxi trajectory datasets (Chengdu and Xian) across three key downstream tasks:

  • Destination Prediction (DP):

TrajMamba predicts both GPS coordinates and road segment endpoints from truncated trajectories. The method reduces GPS coordinate errors by up to 45% (Chengdu) and 26% (Xian) compared to the leading baseline JGRM. Road segment prediction accuracy gains are 9–10% over the same baseline.

  • Arrival Time Estimation (ATE):

TrajMamba achieves the lowest mean absolute error (MAE) and mean absolute percentage error (MAPE) of all compared methods.

  • Similar Trajectory Search (STS):

Using cosine similarity of embeddings, TrajMamba achieves the highest Acc@1/Acc@5 and the lowest mean rank, indicating more meaningful trajectory representation.
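For the search task, ranking reduces to nearest-neighbor retrieval over embeddings; a minimal sketch (function name and shapes are illustrative):

```python
import torch
import torch.nn.functional as F

def top_k_similar(query: torch.Tensor, database: torch.Tensor, k: int = 5):
    """Rank database trajectory embeddings by cosine similarity to a query.
    query: (d,), database: (N, d); returns (top-k indices, scores)."""
    sims = F.normalize(database, dim=-1) @ F.normalize(query, dim=0)  # (N,)
    scores, idx = torch.topk(sims, k)
    return idx, scores
```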

The encoder’s computational efficiency is underscored by substantial reductions in embedding time and model size relative to Transformer-based models, readily supporting real-time deployment.

| Task | Improvement over JGRM | Notes |
|---|---|---|
| Destination Prediction | Up to 45% lower GPS error (Chengdu), 26% (Xian); 9–10% higher road segment accuracy | Lower error, higher accuracy |
| Arrival Time Estimation | Lowest MAE and MAPE | Outperforms all baselines |
| Similar Trajectory Search | Highest Acc@1/Acc@5 | Best mean rank |

5. Comparative Context and Distinctive Features

Within the landscape of trajectory representation learning and semantic trajectory analysis—spanning approaches such as RNNs, Transformers, trajectory2vec, and road-based contrastive learning—TrajMamba introduces several distinctive advances:

  • SSM-based encoding:

The dual-branch Traj-Mamba SSM runs in time linear in trajectory length, outperforming Transformer-based encoders on both accuracy and scalability.

  • Travel purpose fusion:

The pre-training regimen achieves integration of textual travel purpose semantics without incurring inference-time cost, in contrast to models requiring heavy language modeling branches during prediction.

  • Automated compression:

The learnable mask-based compression, guided by a knowledge distillation teacher, systematically reduces redundancy in dense, real-world GPS trajectories, which directly improves both computational efficiency and representation quality.

  • Generalization and Transferability:

The resulting embeddings retain strong transferability across tasks, as shown by high performance in prediction, estimation, and search scenarios.

6. Applications and Implications

The design and empirical performance of TrajMamba have direct consequences for a wide range of intelligent transportation and urban mobility systems:

  • Ride-hailing and mobility-on-demand:

Efficiently predicting destinations or travel times from partial trip data allows for optimized dispatch and dynamic pricing.

  • Urban planning and analytics:

Rich travel purpose-aware embeddings support land-use inference, infrastructure design, and behavioral analysis.

  • Real-time anomaly detection:

Compact, semantically meaningful trajectory representations facilitate large-scale streaming analysis for fraud, safety, or congestion detection.

  • Flexible transfer to new tasks:

The approach is amenable to trajectory clustering, next-location recommendation, and new forms of multi-modal spatio-temporal querying.

A plausible implication is that the paradigm set by TrajMamba—dual perspective modeling, semantic-centric pre-training decoupled from inference, and end-to-end compression—may guide future architectures in trajectory intelligence, particularly as application scales and semantic complexity increase.

7. Summary Table: TrajMamba Workflow Components

| Component | Role | Computational Impact |
|---|---|---|
| Traj-Mamba Encoder | Dual GPS/road SSM feature aggregation | Linear in trajectory length |
| Travel Purpose Pre-training | Embeds semantics via contrastive InfoNCE | Training-only (no inference cost) |
| Mask-based Compression | Redundant point removal / feature distillation | Reduced embedding time |
| Downstream Task Inference | Predicts endpoints, times, matches | Fast and accurate |

In summary, TrajMamba exemplifies an efficient, scalable, and semantically enriched trajectory analysis framework, validated by empirical results and distinct architectural design enabling practical deployment in large-scale intelligent mobility contexts (Liu et al., 20 Oct 2025).
