Predicting Long-Term Student Outcomes from Short-Term EdTech Log Data

July 2, 2025

Ge Gao, Amelia Leon, Andrea Jetten, Jasmine Turner, Husni Almoubayyed, Stephen Fancsali & Emma Brunskill

Educational stakeholders are often particularly interested in sparse, delayed student outcomes, such as end-of-year statewide exams. Because such assessments occur rarely, it is harder to identify students who are likely to fail them, and researchers and educators are slow to learn how effective particular educational tools are. Prior work has primarily either used logs from students’ full usage (e.g., year-long) of an educational product to predict outcomes, or used a few minutes of logs to predict outcomes after a short (e.g., 1 hour) session. In contrast, this study investigates whether machine learning predictors using students’ logs from their first few hours of usage can provide useful predictive insight into those students’ end-of-school-year external assessments.