Papers
Topics
Authors
Recent
2000 character limit reached

A Leakage-Aware Data Layer For Student Analytics: The Capire Framework For Multilevel Trajectory Modeling (2511.11866v1)

Published 14 Nov 2025 in cs.CY

Abstract: Predictive models for student dropout, while often accurate, frequently rely on opportunistic feature sets and suffer from undocumented data leakage, limiting their explanatory power and institutional usefulness. This paper introduces a leakage-aware data layer for student trajectory analytics, which serves as the methodological foundation for the CAPIRE framework for multilevel modelling. We propose a feature engineering design that organizes predictors into four levels: N1 (personal and socio-economic attributes), N2 (entry moment and academic history), N3 (curricular friction and performance), and N4 (institutional and macro-context variables)As a core component, we formalize the Value of Observation Time (VOT) as a critical design parameter that rigorously separates observation windows from outcome horizons, preventing data leakage by construction. An illustrative application in a long-cycle engineering program (1,343 students, ~57% dropout) demonstrates that VOT-restricted multilevel features support robust archetype discovery. A UMAP + DBSCAN pipeline uncovers 13 trajectory archetypes, including profiles of "early structural crisis," "sustained friction," and "hidden vulnerability" (low friction but high dropout). Bootstrap and permutation tests confirm these archetypes are statistically robust and temporally stable. We argue that this approach transforms feature engineering from a technical step into a central methodological artifact. This data layer serves as a disciplined bridge between retention theory, early-warning systems, and the future implementation of causal inference and agent-based modelling (ABM) within the CAPIRE program.

Summary

We haven't generated a summary for this paper yet.

Whiteboard

Paper to Video (Beta)

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (1)

Collections

Sign up for free to add this paper to one or more collections.