Time series forecasting based on ensemble learning methods applied to agribusiness, epidemiology, energy demand, and renewable energy

Main author: Ribeiro, Matheus Henrique Dal Molin
Format: Thesis
Language: English
Published in: Pontifícia Universidade Católica do Paraná 2022
Subjects:
Online access: http://repositorio.utfpr.edu.br/jspui/handle/1/28866
Abstract: Time series forecasting and analysis are helpful in the decision-making process. However, exogenous factors, nonlinearities, and seasonality make developing efficient forecasting models challenging. In this context, the use of machine learning approaches, especially ensemble learning models, becomes attractive. These approaches are based on the divide-and-conquer paradigm, where models are combined to generate an efficient model to solve regression (or time series forecasting), clustering, and classification tasks. This thesis aims to propose and evaluate the effectiveness of approaches based on ensemble learning models for time series forecasting problems. Four major contributions are presented in this thesis. The first lies in applying and comparing bagging, boosting, and stacking ensemble learning methods with single models for the short-term forecasting of soybean and wheat prices for Paraná state, Brazil. The influence of exogenous variables on forecasting these agricultural commodities is also investigated. The second evaluates ensemble learning methods that combine ensemble empirical mode decomposition, heterogeneous ensemble learning models, and multi-objective optimization to forecast the multi-step-ahead monthly incidence of meningitis cases in four Brazilian states. In the third, two time series pre-processing techniques are integrated with heterogeneous ensemble learning models to perform electricity load forecasting. Finally, a competitive ensemble model that combines bagging and stacking ensemble learning models is proposed for multi-step wind power generation forecasting. The forecasting performance of the proposed models is assessed through different criteria, such as mean absolute error, mean squared error, mean absolute percentage error, symmetrical mean absolute percentage error, and relative mean squared error. Also, the Diebold-Mariano test is adopted to evaluate the statistical difference between the forecasting errors of the compared models. The results of the case studies show that the ensemble learning methods can reach lower forecasting errors than single models and than non-decomposed, non-optimized, and decomposed homogeneous ensemble learning models. In terms of percentage errors, the ensemble learning methods achieved a forecasting error lower than 1% for the first application. To forecast meningitis cases, the proposed ensemble learning model has errors ranging from 4.52% to 12.77%. For the third application, the proposed approach reached errors between 0.80% and 7.79%. Finally, concerning the fourth application, the cooperative and competitive ensemble learning model has errors ranging from 13.81% to 21.20%. Therefore, the results indicate that using ensemble learning methods is promising to improve forecasting accuracy. Also, the models proposed in this thesis can be adapted for other areas and used to help the decision-making process.
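For readers unfamiliar with the error criteria named in the abstract, the following is a minimal Python sketch of four of them (mean absolute error, mean squared error, mean absolute percentage error, and symmetric mean absolute percentage error). It uses the common textbook definitions; the abstract does not state the exact formulas used in the thesis, so the function names, the sample values, and in particular the sMAPE denominator chosen here are assumptions made for illustration only.

```python
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumes no observed value is zero.
    """
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Symmetric MAPE, in percent.

    Assumption: uses the (|a| + |p|) / 2 denominator, one of several
    variants in use; the thesis may define it differently.
    """
    return 100.0 * sum(
        abs(a - p) / ((abs(a) + abs(p)) / 2.0) for a, p in zip(actual, predicted)
    ) / len(actual)


# Hypothetical observed values and forecasts, not taken from the thesis.
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]

for name, metric in [("MAE", mae), ("MSE", mse), ("MAPE", mape), ("sMAPE", smape)]:
    print(f"{name}: {metric(observed, forecast):.3f}")
```

Percentage-based criteria such as MAPE and sMAPE are scale-free, which is what allows error levels to be reported as percentages across applications as different as commodity prices and wind power generation.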