Dimensionality reduction method for data derived from the gene expression domain
Main author: | Macedo, Dayana Carla de |
---|---|
Format: | Thesis |
Language: | Portuguese |
Published: | Universidade Tecnológica Federal do Paraná, 2017 |
Subjects: | |
Online access: | http://repositorio.utfpr.edu.br/jspui/handle/1/2481 |
Abstract: |
This research is set in the production cycle of a clinical laboratory working in genomics. It proposes a dimensionality reduction method to support diagnosis in genomic laboratory medicine. The new method, called DRM-F, is framework-based and uses equivalence and generalization concepts to identify the most relevant (gene) attributes in datasets from this domain. DRM-F was compared with attribute selection, an existing data mining method, in order to assess the proposed method against it. Both methods were applied in the gene expression domain to three datasets: DLBCL and DLBCL tumor, which contain lymphoma data, and ALL/AML, which contains leukemia data. With cross-validation as the assessment criterion, both methods improved the hit rate compared with the datasets that retain all attributes. In this domain, the best reduction results were obtained with the wrapper approach on all three datasets. Although DRM-F performed below attribute selection, its average hit rate did not fall below 80% in the generation of predictive models, so it cannot be considered a poorly performing reduction method. DRM-F aims to extract common and specific (gene) attributes not from a single dataset but across all datasets belonging to the gene expression domain. This makes it possible to obtain the (gene) attributes shared by the diseases analyzed across the datasets. In the present experiment, it was possible to extract both the common and the specific (gene) attributes for the analyzed diseases.
With the common and specific (gene) attributes for each disease, these subsets can be submitted to biological analysis to verify their biological significance, with the aim of contributing to biomedical diagnostics and routing. |
---|
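
The idea of common and specific (gene) attributes described in the abstract can be illustrated with plain set operations. The sketch below is not the DRM-F method itself, whose equivalence and generalization steps are not detailed in this record. It only assumes that each dataset has already been reduced to a subset of attributes, and the function name `split_common_specific` and all gene names are hypothetical.

```python
from typing import Dict, Set, Tuple


def split_common_specific(
    datasets: Dict[str, Set[str]],
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Return (common, specific) attribute sets for the given datasets.

    common: attributes present in every dataset.
    specific: for each dataset, attributes found in no other dataset.
    """
    if not datasets:
        return set(), {}
    names = list(datasets)
    # Attributes shared by all datasets in the domain.
    common = set.intersection(*(datasets[n] for n in names))
    specific = {}
    for name in names:
        # Everything that appears in at least one other dataset.
        others = set().union(*(datasets[o] for o in names if o != name))
        specific[name] = datasets[name] - others
    return common, specific


# Hypothetical, already-reduced attribute subsets (gene names are made up).
reduced = {
    "DLBCL": {"GENE_A", "GENE_B", "GENE_C"},
    "DLBCL tumor": {"GENE_A", "GENE_C", "GENE_D"},
    "ALL/AML": {"GENE_A", "GENE_E"},
}

common, specific = split_common_specific(reduced)
print("common:", sorted(common))
for name, genes in specific.items():
    print(name, "specific:", sorted(genes))
```

In this toy input, an attribute that appears in two of the three datasets (here `GENE_C`) is neither common nor specific, which is one reason a real method would need a more careful notion of equivalence between attributes than exact name matching.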