Tratamento temporal em mineração de dados educacionais para fidelização de estudantes

The creation of temporal attributes has proved important in many data mining problems in that the database is formed by data collected historically [Romero e Ventura 2007]. An example of this situation occurs in educational institutions, where the historical data of students – such as school perform...

ver descrição completa

Autor principal: Fazolin, Kleyton
Formato: Dissertação
Idioma: Português
Publicado em: Universidade Tecnológica Federal do Paraná 2017
Assuntos:
Acesso em linha: http://repositorio.utfpr.edu.br/jspui/handle/1/2883
Tags: Adicionar Tag
Sem tags, seja o primeiro a adicionar uma tag!
Resumo: The creation of temporal attributes has proved important in many data mining problems in that the database is formed by data collected historically [Romero e Ventura 2007]. An example of this situation occurs in educational institutions, where the historical data of students – such as school performance and financial situation – has been gradually acquired over time [Romero e Ventura 2007]. This paper presents a proposal for the creation of temporal attributes with the purpose of helping to predict the avoidance of elementary school students in private schools, treated as a classification problem. The loyalty and retention of students in educational institutions has become one of the greatest challenges for the management area of these institutions [Lin 2012]. A promising solution to achieve this goal is the use of educational data mining to identify patterns that aid in decision making. For the experiments, the data of 15,753 students of the Adventist Educational Network – one of the largest educational networks in the world [“Educação Adventista” 2016]– were employed. After the application of the classification algorithms, it was verified that the instance-based KNN classifier obtained the best accuracy before the use of the time attributes created, but the best algorithm to predict the avoidance in the context of this research was the Decision Tree J4.8 algorithm, because it allows the interpretation of the factors that led to the final result. The results show that the approach is feasible, obtaining an accuracy of up to 97.87% in the experiments performed and a gain of up to 14.39% in the accuracy when using the KNN with temporal attributes.