Analysis of the impact of the data cleaning process on malnutrition indicators

Authors

  • Agustín Nicolás Dramis, Universidad de Buenos Aires, Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina
  • María Soledad Fernández, Universidad de Buenos Aires, Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina
  • Adriana Alicia Pérez, Universidad de Buenos Aires, Argentina
  • Pablo Guillermo Turjanski, Universidad de Buenos Aires, Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina

Keywords:

Data Quality, Simulation, Anthropometric data

Abstract

The systematic recording of anthropometric measurements allows the evaluation of the nutritional status of populations, providing a fundamental input for designing, directing, and evaluating public policies. Anthropometric measurements are usually entered manually by healthcare professionals, a process prone to data entry errors that can distort the assessment of the population's nutritional status. To address this issue, the WHO introduced guidelines for removing individually implausible values. However, these guidelines are not considered sufficient to detect all errors, and other methods exist that detect longitudinal inconsistencies among records of the same individual. In this study, we simulated an anthropometric database (based on a real one) and randomly introduced four types of errors described in the literature. We then measured the impact of these errors, and of the cross-sectional and longitudinal cleaning processes, on the prevalence of a malnutrition indicator. The prevalence increased after each type of error was introduced and converged towards its original value after the cleaning processes were applied. This highlights the importance of implementing data cleaning procedures before analyzing nutritional indicators.
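
The workflow described in the abstract (simulate, inject errors, clean, compare prevalence) can be illustrated with a minimal sketch. This is not the authors' code: the indicator (stunting, height-for-age z-score below -2), the standard-normal simulated population, the 5% error rate, and the single gross-error model in `inject_errors` are all assumptions chosen for illustration, and only a cross-sectional step is shown. The flag range of -6 to 6 for height-for-age z-scores follows the WHO convention for biologically implausible values.

```python
import random

STUNTING_CUTOFF = -2.0       # assumed indicator: HAZ below -2 counts as stunted
WHO_HAZ_FLAG = (-6.0, 6.0)   # HAZ outside this range is flagged as implausible


def prevalence(haz_values, cutoff=STUNTING_CUTOFF):
    """Share of records whose z-score falls below the cutoff."""
    return sum(z < cutoff for z in haz_values) / len(haz_values)


def inject_errors(haz_values, error_rate, rng):
    """Corrupt a random fraction of records (hypothetical error model)."""
    corrupted = []
    for z in haz_values:
        if rng.random() < error_rate:
            # Hypothetical gross error: shift the z-score far downward,
            # as might happen if a height were mistyped.
            corrupted.append(z - rng.uniform(5.0, 10.0))
        else:
            corrupted.append(z)
    return corrupted


def clean_cross_sectional(haz_values, limits=WHO_HAZ_FLAG):
    """Drop records whose z-score lies outside the plausibility range."""
    low, high = limits
    return [z for z in haz_values if low <= z <= high]


rng = random.Random(42)  # fixed seed so the run is reproducible
original = [rng.gauss(0.0, 1.0) for _ in range(10_000)]  # simulated HAZ values
with_errors = inject_errors(original, error_rate=0.05, rng=rng)
cleaned = clean_cross_sectional(with_errors)

p_original = prevalence(original)
p_errors = prevalence(with_errors)
p_cleaned = prevalence(cleaned)

print(f"original:     {p_original:.4f}")
print(f"with errors:  {p_errors:.4f}")
print(f"after flags:  {p_cleaned:.4f}")
```

In this toy setting, some corrupted values remain inside the plausibility range and survive the cross-sectional filter, which is the kind of residual error that longitudinal checks on records of the same individual are meant to catch.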

Published

2023-07-21

Section

CAIS - Congreso Argentino de Informática y Salud

How to Cite

Dramis, A. N., Fernández, M. S., Pérez, A. A., & Turjanski, P. G. (2023). Analysis of the impact of the data cleaning process on malnutrition indicators. JAIIO, Jornadas Argentinas De Informática, 9(5), 20-27. https://revistas.unlp.edu.ar/JAIIO/article/view/18128