Estudo comparativo da morfologia textual de laudos médicos e de textos de propósito geral [Comparative study of the textual morphology of medical reports and general-purpose texts]

Main author: Oliveira, Gabrielle Piezzoti
Format: Undergraduate thesis (Trabalho de Conclusão de Curso)
Language: Portuguese
Published in: Universidade Tecnológica Federal do Paraná, 2020
Online access: http://repositorio.utfpr.edu.br/jspui/handle/1/5994
Abstract: Context: The understanding of human speech by computers has been a research question for years and can be applied to different goals. This monograph uses Natural Language Processing (NLP) to perform a morphological survey of the content of medical reports. These data can help future studies of this type of text, since the literature lacks analyses of medical reports. Objective: The goal is to identify the predominant features of this type of text and analyze how it differs from other types of text. To do this, an academic corpus and a journalistic corpus are used for comparison. Method: The first step was formatting the corpora so that they are compatible with the analysis tool, followed by a cleanup to correct or remove unwanted elements. The medical reports required extra steps to select the 500 reports that compose the corpus. The next step was processing the corpora and gathering the following information: the most frequent parts of speech, lemmas, unigrams, bigrams and trigrams. Results: The statistics extracted from the data provided by the tool show considerable variation between the morphological content of the medical reports and that of the other texts, which reinforces the empirical hypotheses: there is a significant difference in the specification of the medical reports and in the proportion of adjectives, verbs and numbers, compared to the other types of text. Conclusion: The analysis described the morphological profile of medical reports as a type of text with more nouns, numbers and adjectives than the others, but with fewer verbs, pronouns and determiners. Moreover, they have simpler and more direct sentences, with a limited vocabulary. Further research could vary the corpora to enrich the morphological analysis presented in this monograph.
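
Illustrative note: the kind of counts named in the Method (most frequent n-grams and the share of each part of speech) can be sketched in a few lines of Python. This is a minimal sketch, not the monograph's tool or pipeline, which this record does not name. The helper names (`ngrams`, `top_ngrams`, `pos_proportions`), the tag labels, and the hand-tagged toy tokens are all assumptions made for illustration; a real study would obtain tags and lemmas from an NLP tagger.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(tokens, n, k=3):
    """Return the k most frequent n-grams with their counts."""
    return Counter(ngrams(tokens, n)).most_common(k)


def pos_proportions(tagged):
    """Return the share of each part-of-speech tag among (token, tag) pairs."""
    counts = Counter(tag for _, tag in tagged)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: count / total for tag, count in counts.items()}


# Hypothetical, hand-tagged toy input (not data from the monograph).
tagged = [
    ("nodule", "NOUN"), ("measuring", "VERB"), ("2", "NUM"), ("cm", "NOUN"),
    ("nodule", "NOUN"), ("measuring", "VERB"), ("3", "NUM"), ("cm", "NOUN"),
]
tokens = [word for word, _ in tagged]

print(top_ngrams(tokens, 1))    # most frequent unigrams
print(top_ngrams(tokens, 2))    # most frequent bigrams
print(top_ngrams(tokens, 3))    # most frequent trigrams
print(pos_proportions(tagged))  # share of each part of speech
```

Running the same functions over each corpus and comparing the resulting proportions would give the sort of cross-corpus contrast the Results describe, under the assumption that all corpora are tagged with the same tag set.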