Predicting School Dropout Using Machine Learning Models: A Case Study of Nyanza District, Rwanda
Abstract
This study investigates the application of machine learning models to predict school dropout in Nyanza District, Rwanda, addressing the challenge of identifying at-risk students early. Adopting a classification-based approach, the research analyzes data collected from parents or guardians, school instructors, and teachers to pinpoint contributing factors such as socioeconomic conditions, academic performance, and family background. The research explores a range of machine learning models: Logistic Regression, Decision Tree Classifier, Gradient Boosting Regression, Artificial Neural Networks (ANN), K-Nearest Neighbors (KNN), and Naive Bayes. These models are evaluated using accuracy, recall, precision, F1 score, and ROC-AUC, with an emphasis on balancing recall (the share of at-risk students who are correctly identified) against precision (the share of students flagged as at-risk who actually are). The models differ in performance. KNN achieves an accuracy of 0.72 and a recall of 0.91, meaning it identifies 91% of at-risk students. Naive Bayes, however, is highlighted as the most well-rounded model, balancing precision and recall effectively. This research addresses a gap in predictive analytics for dropout prevention in Nyanza District and offers actionable insights for educators and policymakers seeking to improve student retention through targeted interventions.
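To make the trade-off between recall and precision concrete, the following minimal Python sketch computes accuracy, precision, recall, and F1 score from the four cells of a binary confusion matrix. The function name `classification_metrics` and the counts passed to it are hypothetical: the counts are invented for illustration, chosen only so that accuracy and recall come out at 0.72 and 0.91, and they are not taken from the study's data. The resulting precision and F1 values are therefore illustrative as well and should not be read as the study's results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute basic binary classification metrics from confusion-matrix counts.

    tp: at-risk students correctly flagged
    fp: students flagged as at-risk who did not drop out
    fn: at-risk students the model missed
    tn: students correctly predicted to stay in school
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a model flags no one or no positives exist.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only (NOT the study's data).
metrics = classification_metrics(tp=91, fp=47, fn=9, tn=53)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these assumed counts, recall is high because few at-risk students are missed, while precision is lower because many students are flagged who would not drop out. This is the kind of imbalance that motivates looking at F1 score alongside accuracy and recall.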