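Classificação do nível de crescimento de colônias de fungos em meio sólido: uma abordagem baseada em aprendizado de máquina [Classification of the growth level of fungal colonies on solid medium: a machine learning-based approach]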

Main author: Vismara, Edgar de Souza
Format: Final course project (Specialization)
Language: Portuguese
Published: Universidade Tecnológica Federal do Paraná 2022
Online access: http://repositorio.utfpr.edu.br/jspui/handle/1/30124
Abstract: The measurement of colony growth on solid medium is a common technique in studies that develop control agents for pathogenic fungi. The measurement procedure usually involves visual identification and manual measurement of colonies in Petri dishes. Recently, some measurement techniques have been developed based on the segmentation of colony images. This segmentation is done by applying digital image analysis techniques or machine learning (ML) models. These approaches have two things in common: a highly controlled environment in which the images are obtained, and a segmented image as the final output, which in the case of ML is only possible through an exhaustive process of manual labeling at the pixel level. In addition, these ML-based studies rarely explore the importance of image features in the classification process, and they test a very limited range of ML algorithms. An interesting characteristic of ML is that it allows entire images to be classified without pixel-level labeling. Thus, this work proposes an ML-based method for classifying fungal growth that operates on entire images obtained without any control of the lighting conditions. This method was applied to a set of 537 images of Petri dishes incubated with Botrytis cinerea, obtained in an experiment run in the phytopathology laboratory of UTFPR/DV. The images were pre-processed and 94 features were extracted from them, giving rise to four data sets: "Color channels", "Histograms", "Remain features" (edge + texture) and "Complete" (all features). From these, constant features, identifiers and highly correlated features (correlation threshold of 0.85) were removed. A new data set was also created by aggregating the first three. The labeling process considered 3 growth levels and was performed by a specialist.
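The feature-filtering step described above (dropping constant features and features correlated beyond 0.85) might look roughly like the sketch below. The abstract does not say which member of a correlated pair was kept, so the greedy "keep the first one seen" rule is an assumption, as are the feature names and values, which are invented for illustration only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def filter_features(table, threshold=0.85):
    """Return the names of the features that survive the filter.

    table maps a feature name to its list of values (one per image).
    Assumption: of two highly correlated features, the first one seen is kept.
    """
    kept = []
    for name, values in table.items():
        if len(set(values)) <= 1:
            continue  # constant feature carries no information
        if any(abs(pearson(values, table[k])) > threshold for k in kept):
            continue  # too strongly correlated with a feature already kept
        kept.append(name)
    return kept

# Hypothetical toy data: four images, four candidate features.
features = {
    "red_mean": [0.1, 0.4, 0.5, 0.9],
    "red_mean_scaled": [0.2, 0.8, 1.0, 1.8],  # perfectly correlated with red_mean
    "image_width": [1, 1, 1, 1],              # constant
    "edge_density": [0.9, 0.1, 0.8, 0.2],
}
kept = filter_features(features)
print(kept)
```

In this toy table only red_mean and edge_density survive: image_width is constant and red_mean_scaled is a linear copy of red_mean.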
For each of the 5 data sets, 9 ML algorithms (including two baselines) were trained through a k-fold cross-validation procedure with k = n = 10, producing 45 trained models. To compare model performance, the balanced accuracy was computed and its values were compared using the Kruskal-Wallis test. Of all trained models, 16 showed balanced accuracy above 0.8, and the top 11 showed no difference at the 5% significance level. Among the tested features, those related to the color channels were the most relevant, both according to the importance values computed by the Random Forest and because 6 of the top 11 models were trained only with these features. Overall, the most important feature was the standard deviation of the channel intensity. Finally, from these top 11 models, 3 were selected according to the complexity of their algorithms, all using the color channels as features: Multinomial, k-NN and SVM. These models were then combined through a voting procedure and used for prediction, obtaining a balanced accuracy of 0.9099667.
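As a rough illustration of the evaluation metric and the final ensemble, the sketch below computes balanced accuracy (the mean of the per-class recalls) and combines three sets of predictions by majority vote. The abstract only mentions "a voting procedure", so hard (majority) voting and the tie-breaking rule are assumptions; the labels 0, 1, 2 for the three growth levels and all predictions are invented for illustration and are not the study's data.

```python
from collections import Counter

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class weighs the same."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)

def majority_vote(*model_predictions):
    """Hard voting across models, one vote per model per image.

    Assumption: on a tie, the label that appears first (the earliest
    model's vote among the tied labels) wins.
    """
    combined = []
    for votes in zip(*model_predictions):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Hypothetical labels for 8 images (growth levels encoded as 0, 1, 2).
y_true      = [0, 0, 0, 0, 1, 1, 2, 2]
multinomial = [0, 0, 0, 1, 1, 1, 2, 0]
knn         = [0, 0, 1, 0, 1, 0, 2, 2]
svm         = [0, 0, 0, 0, 1, 1, 1, 2]

ensemble = majority_vote(multinomial, knn, svm)
for name, pred in [("multinomial", multinomial), ("knn", knn),
                   ("svm", svm), ("ensemble", ensemble)]:
    print(name, round(balanced_accuracy(y_true, pred), 4))
```

In this toy case each model errs on different images, so the vote corrects the individual mistakes; real models often share errors, and the gain from voting is then smaller.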