Action classification in videos using graph-based convolutional neural networks (Classificação de ações em vídeos por meio de redes neurais convolucionais baseadas em grafos)

Main author: Costa, Felipe Franco
Format: Dissertation
Language: Portuguese
Published: Universidade Tecnológica Federal do Paraná 2022
Online access: http://repositorio.utfpr.edu.br/jspui/handle/1/30169
Abstract: Video classification methods have been evolving through proposals based on end-to-end deep learning architectures. Many academic works have shown that such end-to-end models are effective at learning the intrinsic characteristics of videos, especially when compared to traditional handcrafted descriptors. In general, convolutional neural networks are used for deep learning on videos. When applied in such contexts, these networks can incorporate temporal information, memory cells (e.g. long short-term memory), or optical flow techniques used in conjunction with the convolution process. However, despite their effectiveness, these methods neglect global analysis, processing only a small number of frames in each batch during learning and inference. Moreover, they completely ignore the semantic relationships between different videos that belong to the same context. The present work therefore aims to fill these gaps by applying concepts of information grouping and contextual detection through graph-based convolutional neural networks (GCN). With these architectures, we hope to propose new approaches for creating and exploring the relationships between different videos of a given context, improving on the state of the art in the process.