Automatic Categorization of Medical Documents
Keywords:
Approximate String Matching, Assignment Graph, Automatic Categorization, Automatic Coding, Controlled Vocabulary, Hierarchical Model for Categorization of Medical Documents, Hierarchical Terms, Information Retrieval, MedCode, HiMeD Model, Medical Document Databases, Medical Informatics, Vector Space Model
Abstract
The main objective of this thesis is to propose a model for categorizing medical documents, called the HiMeD Model. The HiMeD Model is based on a principle that we call hierarchical correlation of specialized terms: a medical concept to be used in an automatic categorization process can always be represented by terms that are linked along a hierarchical path. This hierarchical linking can contain components that allow the categories to be determined and ordered by the degree of relevance of the adopted concept. Using this principle isolates the categorization task from the unnecessary influence of terms that do not belong to the reference medical vocabulary, and from the direct term-weight calculation used in the information retrieval process of the classic models. The concepts developed here were used in several experiments that demonstrated the quality of the proposed model; these experiments are another important contribution of this work. Finally, a tool for the automatic coding of medical documents was implemented from the components of our model, demonstrating that the model can serve as a basis for building automatic categorization tools. This tool, called MedCode, was used in experiments carried out with the help of medical coding specialists, and its use improved the precision of the automatic coding of medical documents. This improvement is largely due to the interactive and visual characteristics of the prototype, which allowed the specialists to modify the coding environment, select the type of processing algorithm, and change other document processing options.
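The principle of hierarchical correlation can be illustrated with a minimal sketch. This is not the HiMeD Model itself: the toy vocabulary `PARENT`, the helper names `path_to_root` and `categorize`, and the scoring rule (count of path terms found in the text, with deeper terms ranked first on ties) are all assumptions made for illustration. The sketch only shows the two ideas stated in the abstract: words outside the reference vocabulary are ignored, and a concept is ranked by the terms matched along its hierarchical path.

```python
# Hypothetical toy vocabulary: each term maps to its parent term (None = root).
PARENT = {
    "disease": None,
    "infection": "disease",
    "pneumonia": "infection",
    "bacterial": "pneumonia",
    "fracture": "disease",
    "femur": "fracture",
}


def path_to_root(term):
    """Return the hierarchical path from the root down to `term`."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path[::-1]


def categorize(text):
    """Rank vocabulary terms found in `text` by matched terms on their path."""
    tokens = {word.strip(".,;:").lower() for word in text.split()}
    # Words that are not in the reference vocabulary are discarded here.
    found = PARENT.keys() & tokens
    scores = {}
    for term in found:
        path = path_to_root(term)
        matched = sum(1 for t in path if t in found)
        # Assumed scoring rule: matched path terms first, then path depth.
        scores[term] = (matched, len(path))
    return sorted(scores, key=lambda t: scores[t], reverse=True)


ranking = categorize("Bacterial pneumonia, with a healed fracture.")
print(ranking)
```

In this toy example the term "bacterial" ranks first because two terms on its path occur in the text, while "pneumonia" and "fracture" each match only themselves.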
License
Copyright (c) 2000 Luciano Romero Soares de Lima

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
a. Authors retain their copyright and grant the journal the right of first publication of their work, which is simultaneously licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0), allowing third parties to share the work as long as its author and its first publication in this journal are indicated.
b. Authors may enter into other non-exclusive license agreements for the distribution of the published work (for example, depositing it in an institutional electronic archive or publishing it in a monographic volume), as long as its first publication in this journal is indicated.
c. Authors are permitted and encouraged to disseminate their work over the internet (for example, in institutional electronic archives or on their own website) before and during the submission process, which can lead to productive exchanges and increase citations of the published work. (See The Effect of Open Access.)