Auxiliary Forecasting Model Card
- The Auxiliary Forecasting Model Card is a structured framework that documents key design choices such as window length, parameter sharing, and exogenous variable handling.
- It standardizes reporting across key dimensions such as model configuration, preprocessing, and temporal/spatial processing to facilitate fair cross-model benchmarking.
- It promotes rigorous empirical practices by disclosing methodological details, computational complexity, and best practices to enhance reproducibility in forecasting research.
An Auxiliary Forecasting Model Card is a structured documentation framework that explicates essential design choices and methodological dimensions underlying time series forecasting models. Developed to address pervasive ambiguities in empirical research and benchmarking, the card enforces transparency regarding architectural, preprocessing, and training assumptions that critically influence both the performance and the reproducibility of forecasting systems (Moretti et al., 27 Dec 2025). It serves to standardize reporting, enable fair cross-model comparison, and guide rigorous empirical practice in the time series forecasting community.
1. Definition and Rationale
The Auxiliary Forecasting Model Card is specifically tailored to the time-series forecasting context, in contrast to generic machine learning model cards. Its central purpose is to make explicit the often “hidden” design choices—ranging from parameter sharing regime (“local” vs “global”) to look-back window length and handling of exogenous variables—which have disproportionate impact on forecast accuracy, scalability, and inductive capabilities. By standardizing vocabulary and structure, the card seeks to eliminate misleading benchmarking practices and clarify the true origins of observed empirical gains (Moretti et al., 27 Dec 2025).
2. Card Structure: Dimensions and Fields
The model card comprises five primary sections, each with critical sub-dimensions:
- Model Setting:
- Window length ($W$): Number of historical steps used as input. Governs the trade-off between the ability to capture seasonality and computational cost, e.g., $\mathcal{O}(W^2)$ cost for attention-based models.
- Transductive vs. Inductive: Whether the architecture supports deployment to unseen series (“cold start”) using global parameter sharing, or is retrained/tailored per series.
- Missing-data handling: Masking or imputation method for data gaps; directly affects robustness to real-world data artifacts.
- D1: Model Configuration:
- Paradigm: Declares the parameter-sharing regime:
- Local: per-series parameters
- Global: shared parameters
- Hybrid: shared backbone and per-series augmentation
- Shared vs Non-shared architectural elements: Clarifies, for example, whether normalization layers or embeddings are local or global.
- D2: Preprocessing and Exogenous Variables:
- Scaling/normalization strategy: e.g., per-series $z$-score, Reversible Instance Norm.
- Detrending/decomposition: Use of moving average, de-trending, or seasonal decomposition.
- Exogenous variable inclusion: List and treatment of time-of-day, calendar, weather, or lagged features.
- D3: Temporal Processing:
- Core sequence operator: Enumerates whether MLP, GRU/LSTM, TCN, or self-attention modules constitute the main temporal mechanism.
- Complexity with respect to $W$: Scalability profile, i.e., $\mathcal{O}(W)$ for convolutions and $\mathcal{O}(W^2)$ for self-attention.
- Temporal locality vs. globality: Specifies if long-range dependencies are supported.
- D4: Spatial Processing (for multivariate/multi-sensor models):
- Spatial/graph structure and operator: Details on use of GCNs, spatial attention, or graph convolutions.
- Complexity scaling with the series count $N$.
Example fields, notation, and rationales are provided in (Moretti et al., 27 Dec 2025).
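As a concrete, non-authoritative illustration, the sketch below encodes such a card as a Python dataclass and fills it in for a hypothetical patching-based transformer. The field names and example values are assumptions chosen for exposition and do not reproduce the exact schema of (Moretti et al., 27 Dec 2025).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ForecastingModelCard:
    # Model setting
    window_length: int                     # number of historical steps W fed to the model
    inductive: bool                        # True if the model transfers to unseen series
    missing_data_handling: str             # e.g. "mask" or "linear-imputation"
    # D1: model configuration
    paradigm: str                          # "local", "global", or "hybrid"
    shared_components: List[str] = field(default_factory=list)
    per_series_components: List[str] = field(default_factory=list)
    # D2: preprocessing and exogenous variables
    scaling: str = "per-series z-score"
    decomposition: Optional[str] = None    # e.g. "moving-average detrending"
    exogenous: List[str] = field(default_factory=list)
    # D3: temporal processing
    temporal_operator: str = "self-attention"
    temporal_complexity: str = "O(W^2)"    # scaling with the window length W
    # D4: spatial processing (multivariate models only)
    spatial_operator: Optional[str] = None
    spatial_complexity: Optional[str] = None


# Example card for a hypothetical patching-based transformer
patchtst_card = ForecastingModelCard(
    window_length=336,
    inductive=True,
    missing_data_handling="mask",
    paradigm="hybrid",
    shared_components=["patch embedding", "attention backbone"],
    per_series_components=["instance normalization statistics"],
    scaling="Reversible Instance Norm",
    exogenous=["time-of-day", "day-of-week"],
    temporal_operator="patched self-attention",
    temporal_complexity="O((W/P)^2) for patch length P",
)
```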
3. Mathematical Formalisms and Complexity
The model card mandates explicit documentation of mathematical formulations:
- Local/global/hybrid inference notation, consistent with the paradigm definitions above:
- Global: $\hat{y}_i = f_{\theta}(x_i)$, a single parameter set $\theta$ shared across all series $i$.
- Local: $\hat{y}_i = f_{\theta_i}(x_i)$, independent parameters $\theta_i$ for each series.
- Hybrid: $\hat{y}_i = f_{\theta,\,\phi_i}(x_i)$, a shared backbone $\theta$ with per-series augmentation $\phi_i$.
- Explicit expressions for core operators (e.g., dilated convolution, attention maps) and computational complexity.
- Declaration of the loss/metric formulation used as primary criteria (e.g., MAE and MSE).
This level of mathematical transparency is intended to support precise reproducibility, clarify the impact of architectural choices, and facilitate analysis of computational feasibility.
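As an illustration of the three paradigms (a minimal sketch, not drawn from the paper), the PyTorch snippet below realizes global, local, and hybrid forecasters with a simple linear map from a length-$W$ window to a length-$H$ horizon; all class and variable names are hypothetical.

```python
import torch
import torch.nn as nn

W, H, N = 24, 6, 10  # window length, horizon, number of series (illustrative values)


class GlobalForecaster(nn.Module):
    """Global: a single parameter set theta shared by every series."""
    def __init__(self):
        super().__init__()
        self.f = nn.Linear(W, H)

    def forward(self, x):  # x: (N, W) -> (N, H)
        return self.f(x)


class LocalForecaster(nn.Module):
    """Local: an independent parameter set theta_i per series."""
    def __init__(self):
        super().__init__()
        self.f = nn.ModuleList([nn.Linear(W, H) for _ in range(N)])

    def forward(self, x):
        return torch.stack([self.f[i](x[i]) for i in range(N)])


class HybridForecaster(nn.Module):
    """Hybrid: shared backbone theta plus a per-series augmentation phi_i."""
    def __init__(self):
        super().__init__()
        self.shared = nn.Linear(W, H)                   # shared backbone
        self.offset = nn.Parameter(torch.zeros(N, H))   # per-series parameters

    def forward(self, x):
        return self.shared(x) + self.offset


x = torch.randn(N, W)
for model in (GlobalForecaster(), LocalForecaster(), HybridForecaster()):
    print(type(model).__name__, model(x).shape)  # each yields an (N, H) forecast
```

The three variants produce forecasts of identical shape; only the location of the learnable parameters (shared, per-series, or both) differs, which is precisely the distinction the card asks authors to declare.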
4. Use Cases, Best Practices, and Benchmarking Alignment
The card standardizes practical reporting, including templates for popular model families—classical GRU RNNs, patching-based transformers (PatchTST), and spatial TCNs. The example templates specify data windowing choices, paradigm types, preprocessing modalities, and main operators. For benchmarking:
- All models must share the same model setting (window, preprocessing, exogenous inclusion) to ensure only core temporal/spatial processing differences are compared.
- “Hidden hybrids” (models that, for example, retain per-series normalization parameters) must be declared as such to avoid falsely labelling a model “global.”
The card also requires reporting of computational metrics (batch runtime, memory footprint), enabling assessment of scalability and production viability.
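A minimal sketch of how this alignment might be operationalized in a benchmark harness, assuming PyTorch models; `SHARED_SETTING` and `report_compute` are illustrative names, not part of the card specification.

```python
import time
import torch

# Hypothetical shared "model setting" held fixed across every compared model,
# so that observed differences can be attributed to the core temporal/spatial operator.
SHARED_SETTING = {
    "window_length": 96,
    "horizon": 24,
    "scaling": "per-series z-score",
    "exogenous": ["time-of-day"],
    "missing_data_handling": "mask",
}


def report_compute(model: torch.nn.Module, batch: torch.Tensor) -> dict:
    """Measure the computational metrics the card asks for: batch runtime and peak memory."""
    if torch.cuda.is_available():
        torch.cuda.reset_peak_memory_stats()
    start = time.perf_counter()
    with torch.no_grad():
        model(batch)
    runtime_s = time.perf_counter() - start
    peak_mib = (torch.cuda.max_memory_allocated() / 2**20
                if torch.cuda.is_available() else float("nan"))
    return {"batch_runtime_s": runtime_s, "peak_memory_mib": peak_mib}
```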
5. Limitations, Common Pitfalls, and Recommendations
While the Auxiliary Forecasting Model Card raises the standard of empirical reporting, common failure modes persist if dimensions are omitted or misdeclared:
- Neglecting to state inductive/transductive status, which obscures real-world applicability.
- Mismatching preprocessing or covariates across compared models, which can produce spurious performance gaps.
- Omitting computational complexity (e.g., the quadratic cost of self-attention in the window length), which hides factors that undermine deployment feasibility.
Recommended practices include:
- Maintaining and publishing model cards alongside every benchmark or codebase, updating as design choices evolve.
- Using synthetic or control datasets to specifically isolate the impact of single model card dimensions (e.g., temporal vs. spatial operator).
- Using the card to structure and constrain hyperparameter searches, enabling interpretable empirical ablation.
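For the last recommendation, a minimal sketch (an assumption-laden illustration, not taken from the paper) of a hyperparameter grid keyed by card dimensions, so that every candidate configuration is fully declared and results remain attributable to explicit design choices.

```python
from itertools import product

# Hypothetical search space keyed by model-card dimensions; dimension names and
# candidate values are illustrative only.
SEARCH_SPACE = {
    "window_length":     [96, 192, 336],             # model setting
    "paradigm":          ["global", "hybrid"],        # D1: model configuration
    "scaling":           ["z-score", "RevIN"],        # D2: preprocessing
    "temporal_operator": ["gru", "self-attention"],   # D3: temporal processing
}


def card_configurations(space):
    """Enumerate every fully specified configuration in the search space."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


for cfg in card_configurations(SEARCH_SPACE):
    # Train and evaluate one model per configuration, logging cfg alongside its card.
    print(cfg)
```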
A shift to discipline-wide adoption is positioned as a prerequisite for unambiguous attribution of advances and more reproducible, interpretable research.
6. Broader Impact and Adoption
By enforcing explicit declaration of architectural and preprocessing choices, the Auxiliary Forecasting Model Card directly targets the source of contradictory empirical results and “patchwork” reporting in the time series literature. Adoption is anticipated to support:
- Improved reproducibility.
- Transparent benchmarking.
- More interpretable attributions of performance gains to concrete design elements rather than incidental, unshared factors.
This structured approach is designed to be both lightweight (documented as a concise companion to papers and code) and powerful in its ability to standardize practice across both academic research and real-world production settings (Moretti et al., 27 Dec 2025).