Text Mining for Classification and Sentiment Analysis of Personal Stories
Keywords:
Text Mining, Machine Learning, Classification, Sentiment Analysis
Abstract
This work applies text mining tools and machine learning techniques to automate the analysis of the narratives collected in three editions of the book "Matilda and Women in Engineering in Latin America." The goal is to identify the factors that influence women's choice and practice of an engineering career. The methodology follows the guidelines proposed for a Knowledge Discovery in Texts (KDT) process and is divided into several stages: understanding the application domain; data extraction; data cleaning, processing, and transformation; and model development. The project is currently in the corpus construction phase, which includes removing patterns that carry no significant information. Next, the text will be tokenized to study its characteristics, and the most suitable technique for quantifying the words in the corpus will be evaluated. A supervised machine learning model will then be built to predict the main theme of each narrative, and the sentiment of the narrative will be analyzed in relation to that theme. The sentiment of a narrative will be computed as the sum of the sentiments of the individual words that compose it.
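The tokenization, word quantification, and word-level sentiment sum described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lexicon, the stop-word list, the sample sentence, and the function names `tokenize`, `term_counts`, and `sentiment_score` are all hypothetical and chosen only for illustration. A real project would presumably use a published sentiment lexicon suited to the language of the narratives.

```python
import re
from collections import Counter

# Hypothetical toy lexicon and stop-word list, for illustration only.
LEXICON = {"proud": 1.0, "love": 1.0, "support": 0.5, "difficult": -0.5, "alone": -1.0}
STOPWORDS = {"i", "the", "a", "of", "my", "was", "but", "and", "to", "in"}

def tokenize(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-záéíóúüñ]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def term_counts(tokens):
    """Quantify the tokens as raw term frequencies (a bag-of-words view)."""
    return Counter(tokens)

def sentiment_score(tokens, lexicon=LEXICON):
    """Sum the lexicon score of every token; unknown words contribute 0."""
    return sum(lexicon.get(t, 0.0) for t in tokens)

# Hypothetical sample sentence, not taken from the book.
story = "I was proud of my work and I love engineering, but the first year was difficult."
tokens = tokenize(story)
counts = term_counts(tokens)
score = sentiment_score(tokens)
print(tokens)
print(score)
```

A positive total suggests a predominantly positive narrative and a negative total the opposite. Because the sum ignores negation and context, it is best read as a rough first approximation.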
License
Copyright (c) 2023 Adriana Soledad Ruiz Diaz, Miguel Mendez Garabetti

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Under these terms, the material may be shared (copied and redistributed in any medium or format) and adapted (remixed, transformed, and built upon to create another work), provided that a) the authorship and the original source of publication (journal and URL of the work) are credited, b) it is not used for commercial purposes, and c) the same license terms are maintained.











