Citation Age Profiles
- Citation Age Profiles are quantitative measures that capture the time gap between citing and cited works, revealing trends in obsolescence and scholarly impact.
- They are used in bibliometrics to assess reference recency, highlight shifts in research fronts, and identify field-specific citation patterns.
- Models employ decay functions and metrics like mAoC and old fraction to differentiate between demographic growth effects and genuine citation aging.
Citation age profiles quantitatively characterize the temporal distribution of references or citations with respect to the age of the cited work. These profiles are central in bibliometrics, research evaluation, and modeling of scholarly communication, providing insights into the dynamics of knowledge usage, research front movement, and the impact or obsolescence patterns of scientific literature.
1. Definitions and Formalisms
Citation age, in its canonical form, is defined as the difference between the publication year of the citing document and the publication year of the cited document:
$$\mathrm{AoC}(x, y) = Y(x) - Y(y),$$
where $x$ is the citing paper, $y$ is the cited work, and $Y(\cdot)$ denotes the publication year. For a paper $x$ citing $N$ references $y_1, \dots, y_N$, the mean age of citations (mAoC) is:
$$\mathrm{mAoC}(x) = \frac{1}{N} \sum_{i=1}^{N} \mathrm{AoC}(x, y_i).$$
Aggregated over a corpus or research field, citation age profiles can be represented as time series of $\mathrm{mAoC}$ or as the proportion of citations to works exceeding a given age threshold (e.g., 10 years). Another frequently used measure is the "old fraction" $O_T$, denoting the share of references older than $T$ years (Verstak et al., 2014, Wahle et al., 19 Feb 2024, Milojević, 2012).
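The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not an implementation taken from the cited studies: the function names, the example reference list, and the 10-year threshold are all assumptions chosen for the example.

```python
def citation_ages(citing_year, cited_years):
    """Age of each citation: citing year minus cited year."""
    return [citing_year - y for y in cited_years]


def mean_age_of_citation(citing_year, cited_years):
    """Mean age of citations (mAoC) for a single citing paper."""
    ages = citation_ages(citing_year, cited_years)
    return sum(ages) / len(ages)


def old_fraction(citing_year, cited_years, threshold):
    """Share of references strictly older than `threshold` years."""
    ages = citation_ages(citing_year, cited_years)
    return sum(a > threshold for a in ages) / len(ages)


# Hypothetical paper published in 2020 with five references.
refs = [2019, 2018, 2015, 2005, 1998]
print(mean_age_of_citation(2020, refs))  # (1 + 2 + 5 + 15 + 22) / 5 = 9.0
print(old_fraction(2020, refs, 10))      # 2 of 5 references -> 0.4
```

Whether the threshold comparison is strict ("older than") or inclusive ("at least") varies between studies, so the choice should be stated alongside any reported value.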
2. Empirical Trends and Field-Specific Patterns
Large-scale studies reveal pronounced inter-field and epochal differences in citation age profiles. In the period 1990–2013, the global fraction of references to articles at least 10 years old increased from 28.1% to 36.0% (a 28% relative increase), and similar or greater relative increases were found for thresholds of 15 and 20 years (Verstak et al., 2014). However, recent analyses extending to 2023 indicate a reversal, termed “citation age recession,” in which the proportion of older citations declines from previous peaks, most notably in rapidly expanding domains such as NLP and ML (–12.8% and –5.5% from maximum, respectively) (Wahle et al., 19 Feb 2024). In contrast, humanities and mathematics maintain higher mAoC (e.g., History 14.9 years, Mathematics 11.5 years), while computer science and AI subfields consistently display the youngest profiles (NLP 9.4 years) (Wahle et al., 19 Feb 2024, Nguyen et al., 7 Jan 2024).
The drivers of these trends include both structural features (e.g., rate of field growth, field "half-life," shifts in publication or retrieval technologies) and sociocultural pressures (publish-or-perish incentives, reviewer preferences for recency) (Wahle et al., 19 Feb 2024, Verstak et al., 2014). Accelerating declines in mean citation age, especially in fast-moving fields, are often attributed to the higher rate of landmark result production and dynamic knowledge turnover, not merely to citation “amnesia” (Nguyen et al., 7 Jan 2024).
3. Models of Citation Aging and Obsolescence
Citation age profiles are modeled as the outcome of combined preferential attachment and aging processes. In network models, the probability that a new paper cites an older one typically decays as a power-law or exponential function of the cited paper’s age $t$. A power-law factor $t^{-\alpha}$ with $\alpha > 0$ induces obsolescence; empirical fits for citation networks yield $\alpha \approx 1$, implying that the probability decays inversely with age (0908.2615). In hypergraph-based models (Hu et al., 2014, Hu et al., 2021), the citation likelihood likewise decays with age, with field-calibrated exponents fitted separately for the APS and DBLP datasets. These models reproduce observed citation distributions, accounting for both immediate bursts and long-tail obsolescence.
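As a rough illustration of how preferential attachment and aging can be combined, the sketch below weights each candidate paper by its current citation count times a power-law aging factor. The multiplicative form, the `+ 1` offset (so that uncited papers can still be cited), and the function name are assumptions made for this example; the cited models may differ in detail.

```python
def citation_probabilities(in_degrees, ages, alpha=1.0):
    """Probability that a new paper cites each existing paper.

    Assumed weight: (citations received + 1) * age ** (-alpha).
    Ages must be >= 1 so the power-law factor is well defined.
    """
    if any(a < 1 for a in ages):
        raise ValueError("ages must be at least 1")
    weights = [(k + 1) * a ** (-alpha) for k, a in zip(in_degrees, ages)]
    total = sum(weights)
    return [w / total for w in weights]


# Three uncited papers aged 1, 2, and 4 years: with alpha = 1 the
# probabilities are inversely proportional to age (ratios 4 : 2 : 1).
probs = citation_probabilities([0, 0, 0], [1, 2, 4], alpha=1.0)
print(probs)

# A heavily cited old paper can still outcompete a young uncited one.
print(citation_probabilities([0, 99], [1, 20], alpha=1.0))
```

Setting `alpha=0` removes the aging term and leaves pure preferential attachment, which makes the separate contribution of each mechanism easy to inspect.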
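Alternative models, such as the uniform citability model, posit no obsolescence (citability does not depend on age), resulting in predicted citation age distributions determined solely by literature growth dynamics:
$$p_t(a) \propto N(t - a),$$
where $p_t(a)$ is the share of references of age $a$ in works published in year $t$ and $N(s)$ is the number of works published in year $s$. Here, recency bias emerges naturally from the simple demographic effect that newer articles are more numerous (Ghaffari et al., 2023).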
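The demographic effect can be made concrete with a small sketch. It assumes, purely for illustration, that the literature grows at a constant annual rate and that every existing paper is equally likely to be cited; the function names, the growth rates, and the 50-year horizon are hypothetical choices, not values from the cited work.

```python
def uniform_citability_age_distribution(growth_rate, max_age):
    """Reference age distribution when all papers are equally citable.

    With constant annual growth g, the number of papers published
    `a` years ago is proportional to (1 + g) ** (-a), so the share of
    references of age `a` is proportional to the same quantity.
    """
    counts = [(1 + growth_rate) ** (-a) for a in range(1, max_age + 1)]
    total = sum(counts)
    return [c / total for c in counts]


def recent_share(growth_rate, max_age=50, window=5):
    """Share of references no older than `window` years."""
    dist = uniform_citability_age_distribution(growth_rate, max_age)
    return sum(dist[:window])


# Faster growth yields a younger reference profile with no aging at all.
for g in (0.0, 0.02, 0.07):
    print(g, round(recent_share(g), 3))
```

With zero growth the distribution is flat, so any tilt toward recent ages in this sketch comes from growth alone.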
Stochastic dynamic models incorporating branching (i.e., copying via reference chains) demonstrate that citation distributions reflect both direct and indirect (redirection) processes, with exponential or heavier-tailed decay dependent on whether the indirect process is subcritical or supercritical (Golosovsky et al., 2014).
4. Metrics and Analytical Approaches
Multiple metrics quantify citation age profiles:
- Older Fraction: $F_T(t) = C_{\ge T}(t) / C(t)$,
where $C_{\ge T}(t)$ is the count of citations in year $t$ to works published at least $T$ years prior and $C(t)$ is the total count of citations made in year $t$ (Verstak et al., 2014).
- Mean AoC (mAoC) and Old Fraction ($O_T$): The mean reference age and the proportion of references older than a specified threshold of $T$ years (Milojević, 2012, Wahle et al., 19 Feb 2024).
- Price Index (PI): Fraction of references published in the most recent $k$ years (often $k = 5$) (Milojević, 2012).
- Modified Price Index (MPI): Ratio of $k$-year to $2k$-year recent citations, discounting foundational references (Milojević, 2012).
- Volume-Adjusted Average Citation Age (VACA): Normalizes mean citation age to the (log of) publication volume to control for size effects (Wahle et al., 19 Feb 2024).
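Several of the metrics listed above reduce to simple counts over a list of reference ages. The following sketch computes the Price Index, the Modified Price Index, and the older fraction; the inclusive boundaries (`<=` and `>=`), the default windows, and the example ages are assumptions for illustration. VACA is omitted because its exact normalization is not given here.

```python
def price_index(ages, k=5):
    """Share of references published within the most recent k years."""
    return sum(a <= k for a in ages) / len(ages)


def modified_price_index(ages, k=5):
    """Ratio of k-year to 2k-year recent references.

    References older than 2k years (e.g. foundational works) do not
    affect the result. Returns None if there are no 2k-year references.
    """
    recent_2k = sum(a <= 2 * k for a in ages)
    if recent_2k == 0:
        return None
    return sum(a <= k for a in ages) / recent_2k


def older_fraction(ages, threshold=10):
    """Share of references at least `threshold` years old."""
    return sum(a >= threshold for a in ages) / len(ages)


ages = [1, 2, 3, 6, 8, 12, 20, 30]  # hypothetical reference ages
print(price_index(ages))            # 3 / 8 = 0.375
print(modified_price_index(ages))   # 3 / 5 = 0.6
print(older_fraction(ages))         # 3 / 8 = 0.375
```

Comparing the first two outputs shows how the modified index ignores the three oldest references entirely, whereas the Price Index is pulled down by them.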
Time-series and field-stratified aggregation reveal baseline velocities for research fronts and expose meso-level heterogeneity by academic age, productivity, and collaboration (Milojević, 2012). Network-centric analyses demonstrate broad classes of age profiles, including early-peak, late-peak, multi-peak, and monotonic categories, each with distinctive citation trajectories and field/venue affinities (Chakraborty et al., 2015).
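| Metric | Formula (abbreviated) | Interpretation |
|---|---|---|
| mAoC | $\frac{1}{N}\sum_{i=1}^{N} a_i$ | Mean citation age of references |
| PI | $\#\{a_i \le k\} / N$ | Share of very recent citations (cutting edge) |
| O | $\#\{a_i > T\} / N$ | Share of citations to foundational literature |
| MPI | $\#\{a_i \le k\} / \#\{a_i \le 2k\}$ | Speed of research front elimination of old refs |

In the table, $a_i$ denotes the age of the $i$-th of a paper's $N$ references, $k$ the recency window, and $T$ the age threshold.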
5. Causes, Consequences, and Interpretation
Several studies emphasize that declines in mean citation age or old-fraction are not always indicators of citation “amnesia” or bias, but may reflect underlying dynamics such as exponential field expansion, frequent emergence of new “landmark” literature, and changes in discoverability (e.g., digitization, full-text search, relevance ranking) (Verstak et al., 2014, Nguyen et al., 7 Jan 2024). Conversely, “citation age recession” identified in recent decades across scientific fields highlights an increasing tilt toward recency even after adjusting for growth, with evidence that this is not solely a product of output volume (Wahle et al., 19 Feb 2024).
Key consequences of recency bias or age recession include the risk of rediscovery (failure to acknowledge foundational insights), a narrowing of discourse scope, and a potential decline in research robustness and reproducibility. The field-dependent “half-life” of literature, i.e., the characteristic time over which citation propensity decays by half, varies: in physics subfields, half-lives range from 3.0 to 3.8 years, with long “tails” for large-collaboration areas (Higham et al., 2017).
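To make the half-life notion concrete, the sketch below assumes a purely exponential decay of citation counts with age and estimates the half-life from a log-linear least-squares fit. This is an illustrative simplification, not the estimation procedure of the cited study; the function name and the synthetic data are hypothetical.

```python
import math


def half_life_from_counts(ages, counts):
    """Estimate a half-life assuming counts ~ exp(-lam * age).

    Fits log(count) = b - lam * age by ordinary least squares and
    returns ln(2) / lam. All counts must be positive.
    """
    logs = [math.log(c) for c in counts]
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, logs))
    sxx = sum((x - mean_x) ** 2 for x in ages)
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("counts do not decay with age")
    return math.log(2) / -slope


# Synthetic counts that halve every 3.5 years (hypothetical data).
ages = list(range(1, 11))
counts = [100 * 0.5 ** (a / 3.5) for a in ages]
print(round(half_life_from_counts(ages, counts), 2))
```

Real citation curves usually rise before they fall and can carry long tails, so a single exponential fit is best restricted to the decaying portion of the profile.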
Empirical and model-based findings also show strong association between researchers’ productivity/collaboration and citation age: highly productive and collaborative authors consistently cite younger references and push the research front, regardless of seniority (Milojević, 2012).
6. Macro- and Meso-Level Heterogeneity
Macro-level citation age trajectories can obscure substantial within-field diversity. Computer science, for example, exhibits six distinct citation age profile types: early-peak, late-peak, multi-peak, monotonic increase, monotonic decrease, and other/uncategorizable, with venue and publication year effects. Conference publications more often follow early-peak or monotonic decrease patterns, while journal articles manifest late-peak or monotonic increase profiles (Chakraborty et al., 2015).
Meso-level analysis further reveals that academic age alone has little effect on reference age, but high collaboration and productivity are robust predictors of citing more recent work and employing more diverse references (Milojević, 2012). These findings challenge simplistic models linking recency exclusively to career stage.
7. Measurement, Modeling, and Methodological Considerations
Best practices for constructing citation age profiles start with precise extraction of publication years for cited and citing works, followed by computation of paper-level and field-level summary statistics (mean, median, old-fraction). Temporal slicing enables tracking over time and comparison across fields or outlets. Robustness requires reporting both mean and median, use of auxiliary measures (e.g., >10 year old fraction), and stratification by key grouping variables (Nguyen et al., 7 Jan 2024, Milojević, 2012).
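The workflow described above can be sketched as a small aggregation routine. It assumes citation records are available as `(field, citing_year, cited_year)` tuples; that record layout, the function name, the 10-year threshold, and the decision to drop negative ages as likely metadata errors are all assumptions for this example.

```python
from collections import defaultdict
from statistics import mean, median


def age_profile(records, threshold=10):
    """Summarize reference ages per (field, citing year) slice.

    Reports mean, median, and the share of references older than
    `threshold` years, as recommended for robustness.
    """
    groups = defaultdict(list)
    for field, citing_year, cited_year in records:
        age = citing_year - cited_year
        if age < 0:
            continue  # assumption: treat negative ages as data errors
        groups[(field, citing_year)].append(age)

    profile = {}
    for key in sorted(groups):
        ages = groups[key]
        profile[key] = {
            "n": len(ages),
            "mean": mean(ages),
            "median": median(ages),
            "old_fraction": sum(a > threshold for a in ages) / len(ages),
        }
    return profile


# Hypothetical records for two fields in one citing year.
records = [
    ("NLP", 2020, 2019), ("NLP", 2020, 2015), ("NLP", 2020, 2000),
    ("History", 2020, 1990), ("History", 2020, 2010),
    ("NLP", 2020, 2021),  # negative age, skipped
]
for key, stats in age_profile(records).items():
    print(key, stats)
```

In the NLP slice the mean (about 8.7 years) sits well above the median (5 years) because of a single old reference, which is the kind of skew that motivates reporting both statistics.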
Modeling approaches should carefully disentangle demographic growth effects from genuine aging/obsolescence. The recent critique of recency bias based purely on reference age statistics argues that such patterns arise inevitably under uniform citability due to growth, and true obsolescence requires inference of age-dependent citability, not merely measurement of reference list composition (Ghaffari et al., 2023).
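One simple way to separate growth from aging, sketched here under stated assumptions, is to divide the number of citations going to each age by the number of papers available at that age. The function name and the publication counts below are hypothetical, and this per-paper normalization is an illustration of the idea rather than the method of the cited critique.

```python
def citability_by_age(citations_by_age, papers_by_year, citing_year):
    """Citations received per available paper, as a function of age.

    citations_by_age: {age: citations made in `citing_year` to works
                       of that age}
    papers_by_year:   {year: number of works published that year}
    A flat result is consistent with uniform citability; a declining
    one suggests genuine obsolescence.
    """
    rates = {}
    for age, n_citations in sorted(citations_by_age.items()):
        available = papers_by_year.get(citing_year - age, 0)
        if available > 0:
            rates[age] = n_citations / available
    return rates


# Hypothetical field that doubles its output every year.
papers_by_year = {2017: 100, 2018: 200, 2019: 400}
# Raw citation counts halve with each year of age...
citations_by_age = {1: 40, 2: 20, 3: 10}
# ...yet the per-paper rate is flat, so growth alone explains the tilt.
print(citability_by_age(citations_by_age, papers_by_year, 2020))
# {1: 0.1, 2: 0.1, 3: 0.1}
```

In this example the reference list looks strongly recency-biased, but the normalized rates show no age dependence at all.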
References
- (Verstak et al., 2014) "On the Shoulders of Giants: The Growing Impact of Older Articles"
- (Wahle et al., 19 Feb 2024) "Citation Amnesia: On The Recency Bias of NLP and Other Academic Fields"
- (Nguyen et al., 7 Jan 2024) "Is there really a Citation Age Bias in NLP?"
- (Milojević, 2012) "How are academic age, productivity and collaboration related to citing behavior of researchers?"
- (0908.2615) "Modeling scientific-citation patterns and other triangle-rich acyclic networks"
- (Hu et al., 2014) "Evolution of citation networks with the hypergraph formalism"
- (Hu et al., 2021) "The Aging Effect in Evolving Scientific Citation Networks"
- (Chakraborty et al., 2015) "On the categorization of scientific citation profiles in computer sciences"
- (Golosovsky et al., 2014) "Uncovering the dynamics of citations of scientific papers"
- (Ghaffari et al., 2023) "A model for reference list length of scholarly articles"
- (Higham et al., 2017) "Unraveling the dynamics of growth, aging and inflation for citations to scientific articles from specific research fields"