Time series forecasting based on ensemble learning methods applied to agribusiness, epidemiology, energy demand, and renewable energy

Time series forecasting and analysis are helpful in the decision-making process. However, exogenous factors, nonlinearities, and seasonality make developing efficient forecasting models challenging. In this context, the use of machine learning approaches, especially ensemble learning models, becomes...

Main author: Ribeiro, Matheus Henrique Dal Molin
Format: Thesis
Language: English
Published: Pontifícia Universidade Católica do Paraná 2022
Subjects:
Online access: http://repositorio.utfpr.edu.br/jspui/handle/1/28866
id riut-1-28866
recordtype dspace
spelling riut-1-288662022-06-22T06:06:15Z Time series forecasting based on ensemble learning methods applied to agribusiness, epidemiology, energy demand, and renewable energy Previsão de séries temporais com base em comitês de máquina aplicado ao agronegócio, epidemiologia, demanda de energia e energias renováveis Ribeiro, Matheus Henrique Dal Molin Coelho, Leandro dos Santos https://orcid.org/0000-0001-5728-943X http://lattes.cnpq.br/3483667901818921 Mariani, Viviana Cocco https://orcid.org/0000-0003-2490-4568 http://lattes.cnpq.br/1851884209044569 Coelho, Leandro dos Santos https://orcid.org/0000-0001-5728-943X http://lattes.cnpq.br/3483667901818921 Mariani, Viviana Cocco https://orcid.org/0000-0003-2490-4568 http://lattes.cnpq.br/1851884209044569 Nievola, Julio Cesar https://orcid.org/0000-0002-2212-4499 http://lattes.cnpq.br/9242867616608986 Previdelli, Isolde Terezinha Santos http://lattes.cnpq.br/0295127877081690 Souza, Reinaldo Castro http://lattes.cnpq.br/6992824817295435 Aprendizado de máquinas Otimização matemática Previsão Análise de séries temporais Machine learning Mathematical optimization Forecasting Time-series analysis CNPQ::ENGENHARIAS::ENGENHARIA DE PRODUCAO ENGENHARIAS Time series forecasting and analysis are helpful in the decision-making process. However, exogenous factors, nonlinearities, and seasonality make developing efficient forecasting models challenging. In this context, the use of machine learning approaches, especially ensemble learning models, becomes attractive. These approaches are based on the divide and conquer paradigm, where models are combined to generate an efficient model to solve regression (or time series forecasting), clustering, and classification tasks. This thesis aims to propose and evaluate the effectiveness of approaches based on ensemble learning models for time series forecasting problems. Four major contributions are presented in this thesis. 
The first one applies bagging, boosting, and stacking ensemble learning methods and compares them with single models for the short-term forecasting of soybean and wheat prices in Paraná state, Brazil. It also investigates the influence of exogenous variables on forecasts of these agricultural commodities. The second evaluates ensemble learning methods that combine ensemble empirical mode decomposition, heterogeneous ensemble learning models, and multi-objective optimization to forecast the multi-step-ahead monthly incidence of meningitis cases in four Brazilian states. In the third, two time series pre-processing techniques are integrated with heterogeneous ensemble learning models to perform electricity load forecasting. Finally, a competitive ensemble model that combines bagging and stacking ensemble learning models is proposed for multi-step wind power generation forecasting. The forecasting performance of the proposed models is assessed with criteria such as mean absolute error, mean squared error, mean absolute percentage error, relative mean squared error, and symmetric mean absolute percentage error. Also, the Diebold-Mariano test is adopted to evaluate whether the forecasting errors of the compared models differ statistically. The case study results show that the ensemble learning methods can reach lower forecasting errors than single models and than non-decomposed, non-optimized, and decomposed homogeneous ensemble learning models. In terms of percentage error, the ensemble learning methods achieved a forecasting error below 1% in the first application. For meningitis case forecasting, the proposed ensemble learning model has errors ranging from 4.52% to 12.77%. For the third application, the proposed approach reached errors between 0.80% and 7.79%. 
Finally, concerning the fourth application, the cooperative and competitive ensemble learning model has errors that span 13.81%–21.20%. Therefore, the results indicate that using ensemble learning methods is promising to improve forecasting accuracy. Also, the models proposed in this thesis can be adapted for other areas and used to help the decision-making process. Previsão e análise de séries temporais são importantes para o processo de tomada de decisão. Contudo, fatores exógenos, não-linearidades, e sazonalidade tornam o desenvolvimento de modelos de previsão eficientes uma tarefa desafiadora. Nesse cenário, o uso de abordagens relacionadas ao aprendizado de máquina, especialmente modelos de aprendizado por comitês de máquina torna-se atraente. Essas abordagens são pautadas no paradigma dividir para conquistar, onde os modelos individuais são combinados para gerar um modelo eficiente para resolver tarefas de regressão (ou previsão de série temporal), clusterização e classificação. Esta tese tem como objetivo propor e avaliar a eficácia de abordagens baseadas em modelos de aprendizagem por comitês de máquina para previsão de séries temporais. Quatro contribuições principais são apresentadas. A primeira consiste na aplicação e comparação dos métodos bagging, boosting e stacking com modelos individuais para previsão de curto prazo para preços de soja e trigo de no estado do Paraná, Brasil. Além disso, a influência de variáveis exógenas na previsão dessas commodities agrícolas é investigada. A segunda consiste em avaliar comitês de máquina que consideram o pré-processamento de sinais, comitês de máquina heterogêneos, e otimização multi-objetivo para prever a incidência mensal dos casos de meningite em quatro estados brasileiros. Por sua vez, na terceira contribuição, duas técnicas de pré-processamento são integradas com comitês de máquina heterogêneos para realizar a previsão de demanda de energia elétrica. 
Finalmente, um comitê de máquina competitivo que combina bagging e stacking é proposto para a previsão de geração de energia eólica. O desempenho dos modelos avaliados é calculado por meio de diferentes critérios como erro médio absoluto, erro quadrático médio, erro percentual médio absoluto, erro quadrático médio relativo, erro percentual absoluto médio simétrico, e erro quadrático médio relativo. Além disso, o teste Diebold-Mariano é adotado para avaliar a diferença estatística entre os erros de previsão dos modelos comparados. Os resultados empíricos sugerem que comitês de máquina podem atingir menores erros de previsão quando comparados aos modelos individuais, aos modelos de aprendizagem de conjunto homogêneos, sem pré-processamento, e não otimizado. Em termos de erros percentuais, na primeira contribuição as abordagens de comitê de máquina puderam alcançar um erro percentual menor que 1%, enquanto para o segundo estudo, o erro variou entre 4.52% e 12.77%. No que se refere a terceira aplicação, os modelos propostos atingiram erros entre 0.80% e 7.79%. Por fim, para previsão da geração de energia eólica, os erros percentuais variaram entre 13.81% e 21.20%. Os modelos adotados nesta tese podem ser adaptados para diferentes aplicações e serem suporte do processo de tomada de decisão. 2022-06-21T13:38:43Z 2022-06-21T13:38:43Z 2021-12-03 doctoralThesis RIBEIRO, Matheus Henrique Dal Molin. Time series forecasting based on ensemble learning methods applied to agribusiness, epidemiology, energy demand, and renewable energy. 2021. Tese (Doutorado em Engenharia de Produção e Sistemas) - Pontifícia Universidade Católica do Paraná, Curitiba, 2021. http://repositorio.utfpr.edu.br/jspui/handle/1/28866 eng openAccess application/pdf Pontifícia Universidade Católica do Paraná Pato Branco Brasil Programa de Pós-Graduação em Egenharia de Produção e Sistemas PUCPR
institution Universidade Tecnológica Federal do Paraná
collection RIUT
language English
topic Aprendizado de máquinas
Otimização matemática
Previsão
Análise de séries temporais
Machine learning
Mathematical optimization
Forecasting
Time-series analysis
CNPQ::ENGENHARIAS::ENGENHARIA DE PRODUCAO
ENGENHARIAS
description Time series forecasting and analysis are helpful in the decision-making process. However, exogenous factors, nonlinearities, and seasonality make developing efficient forecasting models challenging. In this context, the use of machine learning approaches, especially ensemble learning models, becomes attractive. These approaches are based on the divide-and-conquer paradigm, in which models are combined to generate an efficient model for regression (or time series forecasting), clustering, and classification tasks. This thesis proposes and evaluates approaches based on ensemble learning models for time series forecasting problems. Four major contributions are presented. The first one applies bagging, boosting, and stacking ensemble learning methods and compares them with single models for the short-term forecasting of soybean and wheat prices in Paraná state, Brazil. It also investigates the influence of exogenous variables on forecasts of these agricultural commodities. The second evaluates ensemble learning methods that combine ensemble empirical mode decomposition, heterogeneous ensemble learning models, and multi-objective optimization to forecast the multi-step-ahead monthly incidence of meningitis cases in four Brazilian states. In the third, two time series pre-processing techniques are integrated with heterogeneous ensemble learning models to perform electricity load forecasting. Finally, a competitive ensemble model that combines bagging and stacking ensemble learning models is proposed for multi-step wind power generation forecasting. The forecasting performance of the proposed models is assessed with criteria such as mean absolute error, mean squared error, mean absolute percentage error, relative mean squared error, and symmetric mean absolute percentage error. 
Also, the Diebold-Mariano test is adopted to evaluate whether the forecasting errors of the compared models differ statistically. The case study results show that the ensemble learning methods can reach lower forecasting errors than single models and than non-decomposed, non-optimized, and decomposed homogeneous ensemble learning models. In terms of percentage error, the ensemble learning methods achieved a forecasting error below 1% in the first application. For meningitis case forecasting, the proposed ensemble learning model has errors ranging from 4.52% to 12.77%. For the third application, the proposed approach reached errors between 0.80% and 7.79%. Finally, in the fourth application, the cooperative and competitive ensemble learning model has errors ranging from 13.81% to 21.20%. Therefore, the results indicate that ensemble learning methods are promising for improving forecasting accuracy. The models proposed in this thesis can also be adapted to other areas and used to support decision-making.
format Thesis
author Ribeiro, Matheus Henrique Dal Molin
title Time series forecasting based on ensemble learning methods applied to agribusiness, epidemiology, energy demand, and renewable energy
publisher Pontifícia Universidade Católica do Paraná
publishDate 2022
citation RIBEIRO, Matheus Henrique Dal Molin. Time series forecasting based on ensemble learning methods applied to agribusiness, epidemiology, energy demand, and renewable energy. 2021. Tese (Doutorado em Engenharia de Produção e Sistemas) - Pontifícia Universidade Católica do Paraná, Curitiba, 2021.
url http://repositorio.utfpr.edu.br/jspui/handle/1/28866