Model for the identification of students at risk of dropout using big data analytics

In the educational context, one of the main metrics of institutional performance is the student dropout rate. A decrease in the number of students at a university implies a reduction of the main resources necessary for its operation. The difficulty of this problem is that students at risk of dropout must be identified as early as possible, so that measures can be adopted before they give up. This work proposes a model for the early identification of students at risk of dropout, which extracts the academic data generated by the university on a weekly basis and applies machine learning techniques to classify students by dropout risk. As a case study we use the Instituto Politécnico de Bragança, Portugal, which provided three different datasets covering the years 2009 to 2017, totaling 200 million records. The results indicate that the proposed model is a good option for the early identification of students at risk of dropout. Based on the critical_rate attribute we created, it is possible to generate a ranking of necessity, allowing institutions to target their resources in order of criticality, minimizing their expenses and the errors of the model itself.
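The ranking of necessity mentioned above can be illustrated with a minimal sketch. The abstract does not define how critical_rate is computed, so the code assumes, purely for illustration, that it is a per-student score between 0 and 1 in which higher values mean more urgent need. The names StudentRisk, rank_by_need and budget are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudentRisk:
    student_id: str
    # Assumption: a score in [0, 1], higher = more urgent.
    # The paper's actual definition of critical_rate is not given here.
    critical_rate: float


def rank_by_need(students, budget):
    """Return at most `budget` students, most critical first."""
    ranked = sorted(students, key=lambda s: s.critical_rate, reverse=True)
    return ranked[:budget]


# Made-up example values, not data from the case study.
students = [
    StudentRisk("s1", 0.20),
    StudentRisk("s2", 0.85),
    StudentRisk("s3", 0.55),
]

top = rank_by_need(students, budget=2)
print([s.student_id for s in top])
```

Under this assumption, an institution that can support only a fixed number of students per week would take the head of the ranked list, so limited resources are directed to the most critical cases first.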

Conference paper