VisitHGNN: Urban Mobility Graph Learning

Updated 7 October 2025
  • VisitHGNN is a heterogeneous graph neural network designed to predict neighborhood-to-destination visit probabilities using multimodal urban data.
  • It integrates spatial, temporal, functional, and socio-demographic features through relation-specific message passing, capturing complex urban dynamics.
  • Real-world evaluations in Fulton County, GA demonstrate its superior performance over baselines, with metrics like a KL divergence of 0.287 and a top-1 accuracy of 0.853.

VisitHGNN is a heterogeneous, relation-specific graph neural network designed to model neighborhood-to-destination visit probabilities in urban settings. It operates over a multimodal graph that integrates spatial, temporal, functional, and socio-demographic inputs, predicting the distribution of visits from census block groups (CBGs) to individual points of interest (POIs). The model combines multi-format node features, heterogeneous edge types, and a masked Kullback–Leibler (KL) divergence loss to estimate empirical visit patterns. On real-world mobility data from Fulton County, Georgia, it outperforms pairwise MLP and distance-only baselines in both accuracy and distributional alignment (Pang et al., 3 Oct 2025).

1. Heterogeneous Graph Construction and Relation Encoding

The VisitHGNN graph is constructed with explicit heterogeneity in both node and edge types. The node set comprises:

  • Points of Interest (POIs): Urban venues characterized by numerical attributes (e.g., polygon area, raw visit counts), structured JSON-derived features (e.g., weekday opening hours, bucketed dwell times), and text-derived embeddings (e.g., names and categories embedded via BERT).
  • Census Block Groups (CBGs): Geographic regions described by 72 socio-demographic features (e.g., population, education, commute modes) augmented by spatial centroid coordinates.

The edge set is defined as follows:

  • POI–POI Edges: These encode three complementary urban relationships:
    • Geospatial proximity: Each POI is linked to its $K$ nearest neighbors using the great-circle (haversine) distance and directional bearing.
    • Temporal similarity: Edges are established between POIs with similar hourly visitation profiles, evaluated via cosine similarity over L1-normalized 168-hour activity vectors.
    • Functional (brand) affinity: Edges connect venues with shared brand identity, empirically co-visited venues, and similar categorical labels to capture demand substitutability.
  • CBG–CBG Edges: Represent spatial contiguity among neighboring CBGs.
  • POI–CBG Edges: Two types:
    • Belong edge: Links each POI to its administrative CBG.
    • Spatial KNN edge: For each POI, connects to the $K$ closest CBGs (candidate origins), annotated by Euclidean centroid distance.

All edges can carry vector or scalar attributes (e.g., distance) and are processed by relation-specific message-passing layers to preserve their semantics.
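
The geospatial and temporal POI–POI relations described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names (`haversine_km`, `initial_bearing_deg`, `knn_edges`, `l1_normalize`, `cosine_similarity`), the brute-force neighbor search, and the choice of kilometers as the distance unit are all assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; unit choice is an assumption


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing from point 1 to point 2, in [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlam)
    return math.degrees(math.atan2(y, x)) % 360.0


def knn_edges(coords, k):
    """Return (src, dst, distance_km, bearing_deg) for each node's k nearest neighbors."""
    edges = []
    for i, (lat_i, lon_i) in enumerate(coords):
        dists = [
            (haversine_km(lat_i, lon_i, lat_j, lon_j), j)
            for j, (lat_j, lon_j) in enumerate(coords)
            if j != i
        ]
        for d, j in sorted(dists)[:k]:
            edges.append((i, j, d, initial_bearing_deg(lat_i, lon_i, *coords[j])))
    return edges


def l1_normalize(v):
    """Scale a non-negative activity vector so its entries sum to 1."""
    total = sum(abs(x) for x in v)
    return [x / total for x in v] if total > 0 else list(v)


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0
```

For the temporal relation, each POI's 168-hour visit profile would be passed through `l1_normalize` and an edge kept when `cosine_similarity` exceeds some threshold; the source does not state the threshold, so it is left to the caller here. A spatial index would replace the quadratic loop in `knn_edges` at county scale.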

2. Node and Edge Feature Encoding, Multimodal Fusion

POI encoders process input via multiple parallel modules:

  • Numerical Encoder: Processes polygon areas, visit counts, visitor counts, median dwell times, and spatial distances through a multilayer perceptron (MLP).
  • JSON-derived Feature Encoder: Extracts and aggregates structured features like daily open/close times and normalized dwell-time histograms.
  • Text Encoder: Concatenates names, brands, and categorical labels, then encodes them with a pre-trained BERT-based language model to obtain a dense vector.

CBG encoders implement a two-layer GraphSAGE architecture, jointly consuming 72 demographic variables and centroid coordinates, using message passing over CBG–CBG adjacency.

Relation-specific message passing is achieved using GATv2 kernels for each POI–POI edge type, with each relation receiving an independent transformation. Outputs of all relations are merged through a gated residual mechanism, ensuring that non-redundant relational signals propagate into the final embedding. For POI–CBG fusion, GraphSAGE is applied to “belong” edges, and GATv2 is used on spatial KNN edges to facilitate information exchange across heterogeneous node types.
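
The source does not give the exact form of the gated residual merge, so the following is only one plausible reading, written in plain Python for illustration. It assumes each relation produces a message vector with the same dimension as the base POI embedding, that a scalar sigmoid gate is computed per relation from the concatenation of the base embedding and the message, and that gated messages are added to the base embedding. The function name `gated_residual_merge` and the parameter layout are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_residual_merge(h, relation_msgs, gate_weights, gate_biases):
    """Merge per-relation messages into a base embedding (assumed form).

    h:             base embedding, list of floats of length d
    relation_msgs: dict relation name -> message vector of length d
    gate_weights:  dict relation name -> weight vector of length 2 * d
    gate_biases:   dict relation name -> scalar bias
    """
    out = list(h)
    for name, msg in relation_msgs.items():
        concat = list(h) + list(msg)  # gate sees both the base state and the message
        score = sum(w * x for w, x in zip(gate_weights[name], concat)) + gate_biases[name]
        gate = sigmoid(score)  # scalar gate in (0, 1) for this relation
        out = [o + gate * m for o, m in zip(out, msg)]  # residual update
    return out
```

With a strongly negative gate score a relation contributes almost nothing and the base embedding passes through unchanged, which is the behavior the residual path is meant to guarantee. In a trained model the gate parameters would be learned jointly with the relation-specific layers.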

All embeddings are normalized via GraphNorm and projected (via MLPs or linear layers) into a shared latent space of dimension $d_{\text{hid}}$.

3. Candidate Set Inference, Scoring, and Probability Calibration

For each POI $p$, VisitHGNN defines a set of $K$ candidate origin CBGs, $N_K(p)$, as the nearest CBGs in Euclidean space. For each candidate $c_k \in N_K(p)$, the POI and CBG embeddings are concatenated to form pairwise feature vectors.
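
As a rough sketch of this step, the snippet below selects the nearest CBG centroids for one POI and builds the concatenated pair vectors. It assumes planar (projected) coordinates, fixed-length candidate lists padded with `-1` when fewer CBGs are available, and zero vectors in padded slots; these conventions and the names `candidate_set` and `pair_features` are illustrative choices, not details taken from the source.

```python
import math


def candidate_set(poi_xy, cbg_centroids, k):
    """Indices of the k nearest CBG centroids plus a 0/1 validity mask.

    If fewer than k CBGs exist, the list is padded with -1 and the mask
    marks padded slots with 0 (assumed padding convention).
    """
    ranked = sorted(
        range(len(cbg_centroids)),
        key=lambda c: math.dist(poi_xy, cbg_centroids[c]),
    )
    chosen = ranked[:k]
    pad = k - len(chosen)
    return chosen + [-1] * pad, [1] * len(chosen) + [0] * pad


def pair_features(h_poi, h_cbg_table, indices, mask):
    """Concatenate the POI embedding with each candidate CBG embedding.

    Padded slots receive a zero vector in place of a CBG embedding.
    """
    zero = [0.0] * len(h_poi)  # both embeddings share one latent dimension
    return [
        list(h_poi) + (list(h_cbg_table[c]) if m else zero)
        for c, m in zip(indices, mask)
    ]
```

Keeping every POI's candidate list at the same length lets the pair vectors be stacked into a dense batch, with the mask telling later stages which slots to ignore.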

A shared MLP prediction head with dropout and ReLU activations generates logits:

$$s_{p,k} = \mathrm{MLP}([h_\text{POI}(p);\, h_\text{CBG}(c_k)])$$

These logits are passed through a masked softmax to obtain a proper probability distribution over the candidate set:

$$p_{p,k} = \frac{\exp(s_{p,k}) \cdot m_{p,k}}{\sum_{j=1}^K \exp(s_{p,j}) \cdot m_{p,j}}$$

where $m_{p,k} \in \{0,1\}$ masks invalid candidate slots.
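
The masked softmax above can be illustrated for a single POI as follows. This is a minimal standard-library sketch; subtracting the maximum valid logit before exponentiating and returning all zeros when no slot is valid are implementation choices added here for numerical safety, not details stated in the source.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over the slots where mask == 1; masked slots get probability 0."""
    valid = [s for s, m in zip(logits, mask) if m]
    if not valid:
        # No valid candidates: return zeros rather than dividing by zero.
        return [0.0] * len(logits)
    s_max = max(valid)  # shift logits for numerical stability
    exps = [math.exp(s - s_max) if m else 0.0 for s, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, `masked_softmax([0.0, 0.0, 5.0], [1, 1, 0])` splits the probability evenly between the first two slots and assigns none to the masked third slot, even though its logit is the largest.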

4. Training Objective and Loss Function

Supervision is imposed only over legitimate POI–CBG candidate pairs for which visit data exists. The ground-truth distribution $y_{p,k}$ for POI $p$ (fractional share of observed visits from CBG $c_k$) is estimated from aggregated OD (origin-destination) flows. The model is optimized via masked Kullback–Leibler (KL) divergence:

$$\mathcal{L} = \frac{1}{|\mathcal{P}_{\text{tr}}|}\sum_{p \in \mathcal{P}_{\text{tr}}}\sum_{k=1}^{K} y_{p,k} \log\frac{y_{p,k} + \epsilon}{p_{p,k} + \epsilon} \cdot m_{p,k}$$

where $\epsilon$ (e.g., $10^{-9}$) provides numerical stability. This objective ensures full normalization within the candidate set and enforces distributional calibration rather than only pointwise or ranking accuracy.
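
To make the loss concrete, the sketch below evaluates the masked KL term for one POI and averages it over a set of training POIs, following the formula term by term. It uses plain Python lists; the function names `masked_kl` and `batch_loss` are hypothetical, and a real training loop would compute the same quantity on tensors with automatic differentiation.

```python
import math


def masked_kl(y, p, mask, eps=1e-9):
    """Masked KL divergence between observed shares y and predictions p for one POI."""
    return sum(
        m * y_k * math.log((y_k + eps) / (p_k + eps))
        for y_k, p_k, m in zip(y, p, mask)
    )


def batch_loss(ys, ps, masks, eps=1e-9):
    """Average the per-POI masked KL over all training POIs."""
    per_poi = [masked_kl(y, p, m, eps) for y, p, m in zip(ys, ps, masks)]
    return sum(per_poi) / len(per_poi)
```

Slots with $y_{p,k} = 0$ contribute nothing to the sum, so the loss penalizes the model only where observed visit share exists, and it is zero when the predicted distribution matches the observed one exactly.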

5. Model Performance and Evaluation Metrics

VisitHGNN demonstrates high predictive fidelity on large-scale weekly mobility data from Fulton County, GA. Performance metrics include:

  • Mean KL divergence: $0.287$
  • Mean Absolute Error (MAE): $0.008$
  • Top-1 accuracy: $0.853$ (the predicted most likely origin CBG matches the observed most likely origin)
  • Coefficient of determination ($R^2$): $0.892$
  • NDCG@50 (Normalized Discounted Cumulative Gain): $0.966$
  • Recall@5: $0.611$

All metrics are averaged over POIs with test observations. These scores substantially surpass those of pairwise MLP and distance-only baselines, indicating that VisitHGNN captures the underlying structure of urban mobility with high fidelity. NDCG@50 in particular demonstrates effective probability ranking, while high top-1 accuracy and $R^2$ reflect prediction sharpness and explained variance.
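
The ranking metrics can be computed per POI and then averaged. The source does not spell out the exact definitions, so the sketch below uses common conventions as assumptions: top-1 agreement compares the argmax of predicted and observed shares, Recall@k measures how many of the observed top-k origins appear in the predicted top-k, and NDCG@k uses the observed shares as graded relevance with a base-2 logarithmic discount.

```python
import math


def ranking(scores):
    """Candidate indices sorted from highest to lowest score (ties keep input order)."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


def top1_match(y, p):
    """True if the top predicted origin is also the top observed origin."""
    return ranking(y)[0] == ranking(p)[0]


def recall_at_k(y, p, k):
    """Share of the observed top-k origins that appear in the predicted top-k."""
    relevant = set(ranking(y)[:k])
    retrieved = set(ranking(p)[:k])
    return len(relevant & retrieved) / len(relevant)


def ndcg_at_k(y, p, k):
    """NDCG@k with observed shares y as gains and predictions p as the ranking."""
    def dcg(order):
        return sum(y[i] / math.log2(rank + 2) for rank, i in enumerate(order[:k]))

    ideal = dcg(ranking(y))
    return dcg(ranking(p)) / ideal if ideal > 0 else 0.0
```

With observed shares `[0.6, 0.3, 0.1]` and predictions `[0.5, 0.2, 0.3]`, the top origin agrees, Recall@2 is 0.5 because the prediction swaps the second and third origins, and NDCG@3 stays close to 1 because the misordered origins carry little observed mass.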

6. Urban Analytics Applications and Domain Implications

The capacity of VisitHGNN to recover accurate, well-calibrated origin distributions for arbitrary POIs has several direct implications:

  • Urban Planning and Land Use: Origin probability maps by POI can inform siting, accessibility analyses, intra-urban equity studies, and investment prioritization by clarifying neighborhood-level contributions to specific destinations.
  • Transportation Policy and Multimodal Design: Accurate origin distributions enable better demand modeling for public transit routing, mobility hub placement, and vehicular/pedestrian flow forecasting.
  • Public Health: Fine-grained estimation of neighborhood contributions to venues allows for targeted exposure/risk assessments, resource allocation, and tailored interventions in epidemiological planning.

A plausible implication is that such high-fidelity, distributional models can directly support scenario analysis, accessibility policy, and real-time mobility system management.

7. Position within the Graph Learning Landscape

VisitHGNN exemplifies an end-to-end, relation-specific heterogeneous graph neural network leveraging multi-modal feature fusion, attention across multiple edge types, and fully distributional calibration. Its approach aligns with modern trends in urban- and location-based AI: moving from descriptive analytics to precise, actionable probabilistic inference. Its performance evidences the value of integrating heterogeneous node and edge features, explicit spatial–temporal–functional relations, and robust calibration objectives within urban graph analytics.

This model situates itself among state-of-the-art techniques for spatial mobility modeling, advancing beyond simple pairwise distance- or attribute-based approaches by employing domain-aligned heterogeneous GNN principles and rigorously validated probabilistic outputs.
