The Retraction Epidemic in Science Across Publishers, Fields, and Countries

Published 2 Apr 2026 in physics.soc-ph | (2604.02302v1)

Abstract: Retractions serve as an indicator of failures in research integrity, yet most analyses focus on absolute counts rather than risk per paper. We use one of the largest open bibliographic databases to develop incidence metrics normalized by population: retractions per publication and per active author annually. Applying an epidemiological framework that models counts with exposure, we find evidence of exponential growth in retraction incidence, with approximately a 5-year doubling time at both the paper and author levels. These patterns vary significantly across fields, publishers, and countries. While scientific output is becoming more democratized globally, retractions are concentrated in fewer countries, creating a "concentration" paradox that calls for targeted monitoring. Despite exponential growth, the absolute incidence remains low (0.12% in 2021), allowing for corrective intervention. Incidence-based monitoring provides a framework for evaluating policies that safeguard research integrity at scale.


Authors (5)

Summary

The paper introduces an incidence-based model that quantifies retraction risk using exposure-adjusted GLMs.
It finds exponential growth with system-wide doubling times of 5–6 years, varying notably across disciplines and regions.
It reveals a concentration paradox where retractions are increasingly clustered in a few publishers and high-risk countries like China and India.

Incidence-Based Characterization of the Retraction Epidemic

Epidemiological Framework and Exponential Dynamics

"The Retraction Epidemic in Science Across Publishers, Fields, and Countries" (2604.02302) deploys an incidence-driven epidemiological framework leveraging the OpenAlex bibliometric dataset (October 2025 snapshot) to quantify research-integrity breaches via retractions while correcting for confounding publication volume and author count growth. By operationalizing retraction risk as incidence—retractions per published paper and per active author—rather than raw counts, the authors rigorously disentangle the multiplicative expansion of scientific literature from genuine shifts in integrity risk. Exposure-adjusted GLMs fitted with negative binomial likelihoods overwhelmingly support an exponential-growth model for incidence at both paper and author levels, with system-wide doubling times of approximately 5–6 years.

Figure 1: Global and temporal patterns show exponential growth of publication and retraction counts, with incidence increasing multiplicatively—GLM fits reveal a robust 5.6-year doubling time.
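The link between a fitted growth rate and the quoted doubling time can be sketched as follows. This is a simplified illustration, not the paper's procedure: it fits ordinary least squares to log incidence rather than a negative binomial GLM with a log-exposure offset, and the yearly counts are synthetic values generated here with a 5.6-year doubling time. The function names are hypothetical.

```python
import math

def fit_log_linear(years, retractions, publications):
    """Least-squares fit of log(retractions / publications) = a + g * t.

    Simplified stand-in for an exposure-adjusted GLM: dividing by
    publications plays the role of the log-exposure offset.
    """
    ts = [y - years[0] for y in years]  # years since the first cohort
    ys = [math.log(r / p) for r, p in zip(retractions, publications)]
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    g = sxy / sxx             # growth rate of incidence per year
    a = mean_y - g * mean_t   # log incidence in the first cohort
    return a, g

def doubling_time(g):
    """Doubling time T = ln(2) / g for exponential growth at rate g."""
    return math.log(2) / g

# Synthetic data (assumption): output grows 4% per year, incidence doubles every 5.6 years.
years = list(range(2000, 2021))
publications = [1_000_000 * 1.04 ** (y - 2000) for y in years]
true_g = math.log(2) / 5.6
retractions = [p * 1e-4 * math.exp(true_g * (y - 2000))
               for y, p in zip(years, publications)]

a, g = fit_log_linear(years, retractions, publications)
print(f"growth rate g = {g:.4f} per year, doubling time = {doubling_time(g):.2f} years")
```

Because publication volume is divided out before fitting, the 4% yearly growth in output does not leak into the estimated incidence growth rate, which is the point of normalizing by exposure.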

The exponential growth of per-paper and per-author retraction incidence appears across domains, fields, publishers, and countries, indicating pervasive multiplicative stress on the integrity-control infrastructure. The paper's model selection procedure (BIC-based model weights) consistently favors exponential incidence trajectories over linear or constant alternatives, with negligible support for additive mechanisms.
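BIC-based model weights are commonly computed by rescaling each model's BIC difference from the best model. A minimal sketch under that standard definition follows; the BIC values are hypothetical placeholders, not figures from the paper, and the paper's exact weighting scheme is assumed rather than confirmed.

```python
import math

def bic_weights(bic_by_model):
    """Convert BIC values into normalized model weights.

    w_i = exp(-0.5 * (BIC_i - BIC_min)) / sum_j exp(-0.5 * (BIC_j - BIC_min))
    """
    best = min(bic_by_model.values())
    raw = {name: math.exp(-0.5 * (bic - best)) for name, bic in bic_by_model.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}

# Hypothetical BIC values for three candidate incidence models.
example_bic = {"constant": 1250.0, "linear": 1180.0, "exponential": 1100.0}
weights = bic_weights(example_bic)
for name, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{name}: {weight:.3g}")
```

Subtracting the minimum BIC before exponentiating keeps the computation numerically stable and leaves the normalized weights unchanged.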

Figure 2: Author incidence of retractions mirrors paper-level trends, confirming system-wide exponential dynamics and a convergent doubling time.

Cross-Disciplinary and Geographic Heterogeneity

Incidence trajectories are not uniform across the scientific ecosystem. Field-level analyses reveal extreme heterogeneity: Computer Science, Engineering, and Chemistry exhibit the shortest doubling times (≈2.8–5 years), while Medicine and Immunology show slower but steady increases (doubling times >10 years). Notably, fast-growing fields do not necessarily have high absolute retraction volumes. At the domain level, Physical Sciences and Social Sciences show doubling times in the 3.8–5.1 year range, whereas Health Sciences grow more slowly (≈10.5 years).

Figure 3: Domain-level retraction incidence curves, with exponential GLM fits, demonstrate marked variations in growth rates and doubling times across major scientific branches.

Regional analysis uncovers pronounced concentration, with China and India exhibiting the highest per-capita retraction incidence (4.5× the global average) and the fastest doubling times. The United States shows evidence of course correction, with flat or declining incidence in recent cohorts, likely attributable to policy reform and tightened editorial screening.

Figure 4: Country-level exponential growth curves for retraction incidence indicate marked discrepancies in doubling times, with China, India, and Russia leading the increase.

Publisher-level analysis shows both exponentially increasing and stabilized trajectories. Taylor & Francis, Springer, and Hindawi show rapid multiplicative increases, whereas Nature Portfolio and Oxford University Press display flat or declining incidence, suggesting that publisher-level monitoring and targeted policy interventions are feasible and can be effective.

Concentration Paradox and Inequality Analysis

Despite democratization of output, with more countries contributing a significant share of publications over time, retractions are increasingly concentrated within fewer countries and publishers, illustrating a stark concentration paradox. Gini coefficient analysis confirms this divergence: while the publication Gini drops from 0.92 to 0.88 (greater equality), the retraction Gini surges from 0.55 to 0.90 (greater concentration).
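For readers unfamiliar with the Gini coefficient, a minimal sketch of one standard formula is given below. The input counts are toy values chosen to show the two extremes; they are not data from the paper, and the paper's exact estimator is not specified here.

```python
def gini(values):
    """Gini coefficient of non-negative counts (0 = equal, near 1 = concentrated).

    Uses G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    where x is sorted ascending and i runs from 1 to n.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

# Toy examples: equal counts across four countries vs. all counts in one country.
print(gini([10, 10, 10, 10]))
print(gini([0, 0, 0, 40]))
```

With this formula the maximum for n entities is (n - 1) / n, so a value close to 0.90 across many countries indicates that a handful of them hold most of the retractions.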

Figure 5: Relative incidence scatter and Gini trajectory reveal the concentration paradox: democratization of publication, but intensified retraction clustering in fewer entities.

Lorenz curves for both countries and publishers further quantify this effect: by 2021, 2–3 publishers account for 50% of global retractions, and fewer than 10 countries account for 90%. Concentration is thus even more pronounced at the publisher level than at the country level.
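Statements of the form "k entities account for x% of retractions" can be read directly off a Lorenz-style cumulative share. The sketch below shows one way to compute both; the retraction counts per publisher are invented for illustration and the helper names are hypothetical.

```python
def lorenz_points(counts):
    """Return (cumulative entity share, cumulative count share) pairs, ascending."""
    ordered = sorted(counts)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, count in enumerate(ordered, start=1):
        running += count
        points.append((i / n, running / total))
    return points

def entities_for_share(counts, share):
    """Smallest number of top entities whose counts reach the given share of the total."""
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0
    running = 0
    for k, count in enumerate(ordered, start=1):
        running += count
        if running >= share * total:
            return k
    return len(ordered)

# Invented retraction counts for six publishers (illustration only).
toy_counts = [40, 30, 10, 10, 5, 5]
print(entities_for_share(toy_counts, 0.5))
print(lorenz_points(toy_counts)[-1])
```

The further the Lorenz points sag below the diagonal, the fewer entities are needed to reach a given share.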

Figure 6: Publisher-level Lorenz curves show that retraction incidence is extremely concentrated, with dramatic growth in inequality over time.

Robustness, Limitations, and Policy Implications

Sensitivity analyses confirm robustness of the exponential-growth model across time windows and overdispersion scenarios. The incidence-based approach corrects for right-censoring by restricting analyses to pre-2021 publication cohorts (median lag between publication and retraction is now ≈2 years), reducing bias in growth-rate estimates. However, the paper acknowledges intrinsic limitations: inability to directly distinguish increasing misconduct from improved detection, inconsistent retraction practices across fields and publishers, and database coverage gaps (especially in Computer Science).

Practically, the low absolute incidence (0.12% in 2021) renders corrective action tenable, but the concentration paradox necessitates targeted surveillance and intervention. Uniform policy approaches are unlikely to address integrity stress in the most affected sectors; publisher- and country-specific remediation is critical. The Hindawi case underscores the capacity for workflow policy changes to disproportionately alter incidence trajectories.

Theoretical and Practical Implications for Future AI and Science

From a theoretical perspective, the system-wide exponential growth in integrity failures challenges additive risk models, requiring the adoption of multiplicative, exposure-adjusted frameworks for integrity surveillance. The strong coupling between author and paper incidence trends points toward broad systemic pressures, potentially diffusing risk via collaborative networks, paper mills, or career incentives.

Practically, because incidence grows exponentially, even small reductions in the growth rate $g$ substantially bend the long-term incidence curve. Incidence-based metrics provide a scalable evaluation paradigm for policy effectiveness, detection improvement, and editorial interventions. As publication output continues to accelerate, driven in part by automated and AI-assisted science, the integrity monitoring framework presented in the paper becomes vital for maintaining trust in scientific communication.
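The sensitivity to the growth rate can be illustrated with a simple extrapolation of $I(t) = I_0 e^{g t}$. The sketch below starts from the reported 0.12% incidence and a 5.6-year doubling time, then compares an unchanged growth rate with a hypothetical 10% reduction in $g$. The 20-year horizon and the 10% figure are assumptions for illustration; this is arithmetic under constant growth, not a forecast from the paper.

```python
import math

def project_incidence(i0, doubling_years, horizon_years, growth_reduction=0.0):
    """Project incidence under exponential growth I(t) = i0 * exp(g * t).

    growth_reduction is the fractional cut applied to g (0.10 means g is 10% lower).
    """
    g = math.log(2) / doubling_years
    g_reduced = g * (1.0 - growth_reduction)
    return i0 * math.exp(g_reduced * horizon_years)

i0 = 0.0012       # 0.12% incidence, as reported for 2021
doubling = 5.6    # years, from the system-wide fit
horizon = 20      # years ahead (assumed for illustration)

baseline = project_incidence(i0, doubling, horizon)
reduced = project_incidence(i0, doubling, horizon, growth_reduction=0.10)
print(f"baseline: {baseline:.4%}, with 10% lower g: {reduced:.4%}")
print(f"relative reduction after {horizon} years: {1 - reduced / baseline:.1%}")
```

The gap between the two trajectories itself widens exponentially with the horizon, which is why a modest change in $g$ matters more the further ahead one looks.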

Conclusion

This study delivers rigorous, exposure-adjusted epidemiological quantification of science's retraction epidemic, demonstrating exponential growth in incidence across papers and authors and intense concentration in high-risk countries and publishers. The multiplicative nature of risk, coupled with the concentration paradox, mandates incidence-based, targeted monitoring and policy remediation. The robustness and granularity of the methodology establish a foundational framework for scientific integrity surveillance in an era of accelerating—and increasingly democratized—global output.
