Deep learning methods for detecting anomalies in videos: theoretical and methodological contributions
Main author: | Ribeiro, Manassés |
---|---|
Format: | Thesis |
Language: | Portuguese |
Published in: | Universidade Tecnológica Federal do Paraná, 2018 |
Subjects: | |
Online access: | http://repositorio.utfpr.edu.br/jspui/handle/1/3172 |
Abstract: |
Anomaly detection in automated video surveillance is a recurrent topic in recent computer vision research. Deep Learning (DL) methods have achieved state-of-the-art performance for pattern recognition in images, and the Convolutional Autoencoder (CAE), which is capable of capturing the 2D structure of objects, is one of the most frequently used approaches. In this work, anomaly detection refers to the problem of finding patterns in images and videos that do not belong to the expected normal concept. To classify anomalies adequately, methods for learning relevant representations were examined. To this end, both the capability of the model to learn features automatically and the effect of fusing hand-crafted features with raw data were studied. For real-world problems, the representation of the normal class is an important issue for detecting anomalies, since one or more clusters can describe different aspects of normality. For classification purposes, these clusters must be as compact (dense) as possible. This thesis proposes the use of the CAE as a data-driven approach to anomaly detection problems. Methods for feature learning that take both hand-crafted features and raw data as input were proposed, and their effect on classification performance was investigated. This work also introduces a hybrid approach combining DL and one-class support vector machine methods, named Convolutional Autoencoder with Compact Embedding (CAE-CE), to enhance the compactness of normal clusters. In addition, a novel sensitivity-based stop criterion was proposed, and its suitability for anomaly detection problems was assessed. The proposed methods were evaluated on publicly available datasets and compared with state-of-the-art approaches. Two novel benchmarks, designed for video anomaly detection on highways, were introduced. The CAE was shown to be promising as a data-driven approach for detecting anomalies in videos.
Results suggest that the CAE can learn spatio-temporal features automatically, and that the aggregation of hand-crafted features seems to be valuable for some datasets. Overall results also suggest that the enhanced compactness introduced by the CAE-CE improved classification performance in most cases, and that the sensitivity-based stop criterion seems to be an interesting alternative. Videos were qualitatively analyzed at the visual level, indicating that the features learned by both methods (CAE and CAE-CE) are closely correlated with the anomalous events occurring in the frames. Much remains to be done towards a more general and formal definition of normality/abnormality that would support researchers in devising efficient computational methods to mimic the semantic interpretation of visual scenes by humans. |
---|