TrajMamba: Efficient Trajectory Learning
- TrajMamba is a dual-branch selective state-space model that fuses continuous GPS dynamics and discrete road data for efficient trajectory analysis.
- Its travel purpose-aware pre-training leverages textual embeddings and contrastive InfoNCE alignment to integrate semantic insights without increasing inference cost.
- The method employs learnable mask-based compression and knowledge distillation to eliminate redundant data, enhancing computational efficiency and prediction accuracy.
TrajMamba is an efficient, semantically rich framework for vehicle trajectory learning centered on the Traj-Mamba architecture: a dual-branch selective state-space model (SSM) combining GPS and road perspectives with specialized pre-training schemes for travel purpose integration and data reduction. TrajMamba is designed to extract movement patterns and embed travel semantics from vehicle GPS trajectories while optimizing for computational efficiency and robust generalization, making it well-suited for large-scale intelligent transportation applications (Liu et al., 20 Oct 2025).
1. Traj-Mamba Encoder Architecture
At the core of TrajMamba is the Traj-Mamba encoder, which jointly models both the continuous movement dynamics and the contextual semantics of vehicle trips:
- Dual-branch SSM design:
- The encoder accepts two types of input features for each trajectory:
- GPS perspective: raw spatial and temporal data, supplemented with high-order movement features such as instantaneous speed $v_t$, acceleration $a_t$, and turning angle $\theta_t$ computed at every timestamp.
- Road perspective: discrete identifiers of traversed road segments and encoded cyclic temporal features (e.g., hour of day, day of week).
- Architecture: Multiple Traj-Mamba blocks are stacked, each containing
- A GPS-SSM branch: input projection, causal convolution, and a selective SSM parameterized by movement features.
- A Road-SSM branch: analogous processing for road-related features.
- Input-dependent parameterization:
  - Each SSM branch computes its state-space parameters $\Delta$, $B$, and $C$, together with the gating signal, via learned projections from the high-order features, e.g.,
  $$\Delta = \phi(W_\Delta H), \qquad B = W_B H, \qquad C = W_C H,$$
  where $H$ denotes the high-order feature sequence and $\phi$ is Softplus or SiLU.
- Feature fusion:
The GPS and road latent embeddings are fused using an element-wise (dot-product) gating mechanism, e.g., $Z = Z_{\text{gps}} \odot Z_{\text{road}}$, where $Z_{\text{gps}}$ and $Z_{\text{road}}$ are the outputs from the respective SSM branches (see the sketch after this list).
- Output trajectory embedding:
The final representation is the concatenation and mean-pooling of the fused outputs. This embedding robustly encodes both movement patterns and spatial context while maintaining linear complexity in trajectory length.
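To make the block structure concrete, below is a minimal PyTorch sketch of one dual-branch block. It is not the authors' implementation: the diagonal SSM, the naive sequential scan in place of an optimized selective-scan kernel, and all class names and dimensions are illustrative assumptions.

```python
# Minimal, illustrative sketch of a dual-branch Traj-Mamba block (not the
# authors' code). Dimensions, names, and the simplified sequential scan in
# place of an optimized selective-scan kernel are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelectiveSSMBranch(nn.Module):
    """One branch (GPS or road): causal conv + input-dependent diagonal SSM."""
    def __init__(self, d_model: int, d_state: int = 16):
        super().__init__()
        self.in_proj = nn.Linear(d_model, d_model)
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=3,
                              padding=2, groups=d_model)  # depthwise, causal
        # Learned projections making Delta, B, C depend on the input
        # (high-order movement features in the GPS branch).
        self.delta_proj = nn.Linear(d_model, d_model)
        self.B_proj = nn.Linear(d_model, d_state)
        self.C_proj = nn.Linear(d_model, d_state)
        self.A_log = nn.Parameter(torch.zeros(d_model, d_state))  # state decay

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, length, d_model)
        bsz, L, D = x.shape
        x = self.in_proj(x)
        x = self.conv(x.transpose(1, 2))[..., :L].transpose(1, 2)  # causal trim
        delta = F.softplus(self.delta_proj(x))          # (B, L, D) step sizes
        Bmat = self.B_proj(x)                           # (B, L, N)
        Cmat = self.C_proj(x)                           # (B, L, N)
        A = -torch.exp(self.A_log)                      # (D, N) stable decay
        h = x.new_zeros(bsz, D, self.A_log.shape[1])    # hidden state
        ys = []
        for t in range(L):                              # naive sequential scan
            dA = torch.exp(delta[:, t].unsqueeze(-1) * A)           # (B, D, N)
            dBx = (delta[:, t].unsqueeze(-1) * Bmat[:, t].unsqueeze(1)
                   * x[:, t].unsqueeze(-1))                          # (B, D, N)
            h = dA * h + dBx
            ys.append((h * Cmat[:, t].unsqueeze(1)).sum(-1))         # (B, D)
        return torch.stack(ys, dim=1)                   # (B, L, D)

class TrajMambaBlock(nn.Module):
    """Dual-branch block: GPS and road SSMs fused by element-wise gating."""
    def __init__(self, d_model: int):
        super().__init__()
        self.gps_branch = SelectiveSSMBranch(d_model)
        self.road_branch = SelectiveSSMBranch(d_model)

    def forward(self, gps_feats, road_feats):
        z_gps = self.gps_branch(gps_feats)
        z_road = self.road_branch(road_feats)
        return z_gps * z_road                           # dot-product gating

# Usage sketch: one block on a batch of 2 trajectories of 120 points.
block = TrajMambaBlock(d_model=64)
emb = block(torch.randn(2, 120, 64), torch.randn(2, 120, 64)).mean(dim=1)
```

Stacking several such blocks and mean-pooling the fused outputs over time yields the trajectory embedding described above, at cost linear in trajectory length.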
2. Travel Purpose-aware Pre-training
To enrich embeddings with travel purpose semantics without affecting inference cost, TrajMamba introduces a two-stage pre-training strategy:
- Textual pre-training branches:
- For each trajectory, road segments and surrounding POIs are embedded using a shared pre-trained textual embedding model, applied to raw textual attributes (e.g., names, POI descriptions).
- Embeddings are contextually enriched through local aggregation and global context using learnable aggregation functions.
- Semantic view extraction:
Each road and POI view is summarized via dedicated Mamba blocks and mean pooling, yielding compressed views $v_{\text{road}}$ and $v_{\text{poi}}$ that encode the trip’s geographic and functional semantics.
- Contrastive InfoNCE-based alignment:
The main trajectory embedding is aligned with both textual views using an InfoNCE loss (with learnable temperature $\tau$). This ensures that the embedding implicitly encodes the trip’s underlying purpose (see the sketch at the end of this section).
Crucially, the text-based branches are employed only during pre-training. At inference, only the efficient Traj-Mamba encoder is used, incurring no extra cost for semantic integration.
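To make the alignment concrete, the following sketch implements a standard two-view InfoNCE objective with in-batch negatives and a learnable temperature; the names (`info_nce`, `purpose_alignment_loss`) and the simple sum over road and POI views are assumptions, not the paper's exact formulation.

```python
# Illustrative InfoNCE alignment of trajectory embeddings with textual views,
# using in-batch negatives and a learnable temperature (assumed details).
import torch
import torch.nn as nn
import torch.nn.functional as F

# Learnable temperature in log-space (register inside an nn.Module in practice).
log_tau = nn.Parameter(torch.zeros(()))

def info_nce(traj_emb: torch.Tensor, view_emb: torch.Tensor) -> torch.Tensor:
    # traj_emb, view_emb: (batch, dim); row i of each side is a positive pair.
    traj = F.normalize(traj_emb, dim=-1)
    view = F.normalize(view_emb, dim=-1)
    logits = traj @ view.t() / log_tau.exp()   # (batch, batch) similarities
    targets = torch.arange(traj.size(0), device=traj.device)
    return F.cross_entropy(logits, targets)

def purpose_alignment_loss(traj_emb, road_view, poi_view):
    # Align the main embedding with both the road and the POI textual views.
    return info_nce(traj_emb, road_view) + info_nce(traj_emb, poi_view)
```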
3. Knowledge Distillation and Trajectory Compression
TrajMamba incorporates a knowledge distillation scheme to both identify key points in trajectories and compress them for fast, high-quality embedding:
- Rule-based preprocessing:
Candidate redundant or non-informative trajectory points (e.g., during vehicle idleness or constant-velocity travel) are pruned from the raw input.
- Learnable mask generation:
For the remaining points, a soft mask vector $m \in [0,1]^L$ is learned via $m = \sigma(Wh + b + \epsilon)$, where $W$ and $b$ are learnable parameters acting on the point features $h$, $\sigma$ is the sigmoid function, and $\epsilon$ is Gaussian noise added during training to enforce robustness and sparsity.
- Compressed trajectory embedding:
The masked trajectory is passed through a student Traj-Mamba encoder (initialized from the travel purpose pre-trained teacher).
- Multi-view entropy coding (MEC) loss with mask penalty:
The distillation loss combines a MEC loss (aligning the compressed embedding with the teacher’s) and a penalty that encourages sparsity in the mask, e.g., $\mathcal{L} = \mathcal{L}_{\text{MEC}} + \lambda \lVert m \rVert_1$ with trade-off weight $\lambda$ (sketched below).
This method ensures that only the most semantically discriminative trajectory points are retained, resulting in smaller, more informative embeddings (Liu et al., 20 Oct 2025).
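A minimal sketch of the learnable masking step, under stated assumptions: the sigmoid parameterization, the noise level, and the mean-based sparsity penalty are illustrative choices, not the paper's exact formulation.

```python
# Illustrative learnable soft mask with Gaussian training noise and a
# sparsity penalty (assumed form); not the authors' exact formulation.
import torch
import torch.nn as nn

class PointMask(nn.Module):
    def __init__(self, d_model: int, noise_std: float = 0.1):
        super().__init__()
        self.score = nn.Linear(d_model, 1)  # learnable W, b
        self.noise_std = noise_std

    def forward(self, h: torch.Tensor):
        # h: (batch, length, d_model) point features after rule-based pruning.
        logits = self.score(h).squeeze(-1)              # (batch, length)
        if self.training:                               # Gaussian noise eps
            logits = logits + self.noise_std * torch.randn_like(logits)
        m = torch.sigmoid(logits)                       # soft mask in [0, 1]
        masked = h * m.unsqueeze(-1)                    # down-weight points
        sparsity_penalty = m.mean()                     # L1-style (m >= 0)
        return masked, sparsity_penalty

# Total distillation objective (MEC term assumed to compare student and
# teacher embeddings; lam balances compression vs. fidelity):
#   loss = mec_loss(student_emb, teacher_emb) + lam * sparsity_penalty
```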
4. Experimental Evaluation
Evaluation was conducted on two large-scale real-world taxi trajectory datasets (Chengdu and Xian) and three key downstream tasks:
- Destination Prediction (DP):
TrajMamba predicts both GPS coordinates and road segment endpoints from truncated trajectories. The method reduces GPS coordinate errors by up to 45% (Chengdu) and 26% (Xian) compared to the leading baseline JGRM. Road segment prediction accuracy gains are 9–10% over the same baseline.
- Arrival Time Estimation (ATE):
TrajMamba achieves the lowest mean absolute error (MAE) and mean absolute percentage error (MAPE) of all compared methods.
- Similar Trajectory Search (STS):
Using cosine similarity of embeddings, TrajMamba achieves the highest Acc@1/Acc@5 and the lowest mean rank, indicating more meaningful trajectory representations (a retrieval sketch follows below).
The encoder’s computational efficiency is underscored by substantial reductions in embedding time and model size relative to Transformer-based models, readily supporting real-time deployment.
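As an illustration of the STS protocol, the following sketch ranks gallery trajectories by cosine similarity of their embeddings and computes Acc@k; the function name and evaluation details are assumptions.

```python
# Illustrative similar-trajectory search by cosine similarity of embeddings;
# the Acc@k protocol details are assumptions.
import torch
import torch.nn.functional as F

def sts_accuracy_at_k(queries, gallery, true_idx, k: int = 5) -> float:
    # queries: (Q, dim), gallery: (G, dim), true_idx: (Q,) index of each
    # query's ground-truth match within the gallery.
    sims = F.normalize(queries, dim=-1) @ F.normalize(gallery, dim=-1).t()
    topk = sims.topk(k, dim=-1).indices                  # (Q, k)
    hits = (topk == true_idx.unsqueeze(-1)).any(dim=-1)  # (Q,)
    return hits.float().mean().item()
```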
| Task | Headline Result | Notes |
|---|---|---|
| Destination Prediction | Up to 45% lower GPS error (Chengdu); 9–10% higher road segment accuracy | vs. leading baseline JGRM |
| Arrival Time Estimation | Lowest MAE and MAPE | Outperforms all baselines |
| Similar Trajectory Search | Highest Acc@1/Acc@5; best mean rank | Outperforms all baselines |
5. Comparative Context and Distinctive Features
Within the landscape of trajectory representation learning and semantic trajectory analysis—spanning approaches such as RNNs, Transformers, trajectory2vec, and road-based contrastive learning—TrajMamba introduces several distinctive advances:
- SSM-based encoding:
The dual-branch Traj-Mamba SSM model supports linear time complexity, outperforming Transformer-based encoders on both accuracy and scalability.
- Travel purpose fusion:
The pre-training regimen achieves integration of textual travel purpose semantics without incurring inference-time cost, in contrast to models requiring heavy language modeling branches during prediction.
- Automated compression:
The learnable mask-based compression, guided by a knowledge distillation teacher, systematically reduces redundancy in dense, real-world GPS trajectories, which directly improves both computational efficiency and representation quality.
- Generalization and Transferability:
The resulting embeddings retain strong transferability across tasks, as shown by high performance in prediction, estimation, and search scenarios.
6. Applications and Implications
The design and empirical performance of TrajMamba bear directly on a wide range of intelligent transportation and urban mobility systems:
- Ride-hailing and mobility-on-demand:
Efficiently predicting destinations or travel times from partial trip data allows for optimized dispatch and dynamic pricing.
- Urban planning and analytics:
Rich travel purpose-aware embeddings support land-use inference, infrastructure design, and behavioral analysis.
- Real-time anomaly detection:
Compact, semantically meaningful trajectory representations facilitate large-scale streaming analysis for fraud, safety, or congestion detection.
- Flexible transfer to new tasks:
The approach is amenable to trajectory clustering, next-location recommendation, and new forms of multi-modal spatio-temporal querying.
A plausible implication is that the paradigm set by TrajMamba—dual perspective modeling, semantic-centric pre-training decoupled from inference, and end-to-end compression—may guide future architectures in trajectory intelligence, particularly as application scales and semantic complexity increase.
7. Summary Table: TrajMamba Workflow Components
| Component | Role | Computational Impact |
|---|---|---|
| Traj-Mamba Encoder | Dual GPS/road SSM feature aggregation | Linear in trajectory size |
| Travel Purpose Pre-training | Embeds semantics via contrastive InfoNCE | Training-only (no inference cost) |
| Mask-based Compression | Redundant point removal/feature distillation | Lowered embedding time |
| Downstream Task Inference | Predicts endpoints, times, matches | Accelerated, accurate |
In summary, TrajMamba exemplifies an efficient, scalable, and semantically enriched trajectory analysis framework, validated by empirical results and distinct architectural design enabling practical deployment in large-scale intelligent mobility contexts (Liu et al., 20 Oct 2025).