Ranking de dimensiones en vectores densos para recuperación eficiente

Tomás Delvechio; Esteban Rissola; Gabriel Tolosa

Dense Vectors Dimensions Ranking for Efficient Retrieval

Authors

Tomás Delvechio Universidad Nacional de Luján, Argentina https://orcid.org/0009-0005-2589-3436
Esteban Rissola Universidad Nacional de Luján, Argentina https://orcid.org/0000-0001-7072-4096
Gabriel Tolosa Universidad Nacional de Luján, Argentina https://orcid.org/0000-0001-8237-7554

Keywords:

neural IR, dense vectors, dimensions ranking, efficiency

Abstract

Information retrieval over collections of millions of documents is a computationally intensive task. The emergence of dense representations (embeddings) enables the construction of vectors with hundreds of dimensions, turning retrieval into a nearest-neighbour vector search problem. We hypothesize that not all embedding dimensions are equally important for the retrieval task and that, therefore, some of them can be pruned. In this paper, we propose ranking embedding dimensions by their importance and evaluate different pruning methods against an objective effectiveness requirement.
Using widely used embedding models and well-known document collections, our experiments show that it is possible to reduce vector size by up to 50% while maintaining up to 90% of the effectiveness, thus improving efficiency.
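
The idea summarized in the abstract (rank dimensions, keep the top-ranked ones, search in the reduced space) can be illustrated with a minimal sketch. The abstract does not state how importance is measured, so per-dimension variance is used here purely as a stand-in assumption; the function names `rank_dimensions`, `prune` and `top_k`, and the `keep_ratio` parameter, are hypothetical and not taken from the paper.

```python
import numpy as np


def rank_dimensions(doc_vectors: np.ndarray) -> np.ndarray:
    """Return dimension indices ordered from most to least important.

    ASSUMPTION: importance is approximated by per-dimension variance over
    the collection; the paper may use a different criterion.
    """
    variances = doc_vectors.var(axis=0)
    # Stable sort keeps the original order among equally important dimensions.
    return np.argsort(-variances, kind="stable")


def prune(vectors: np.ndarray, ranking: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Keep only the top-ranked fraction of dimensions (at least one)."""
    n_keep = max(1, int(round(vectors.shape[1] * keep_ratio)))
    return vectors[:, ranking[:n_keep]]


def top_k(query: np.ndarray, docs: np.ndarray, k: int) -> np.ndarray:
    """Exhaustive nearest-neighbour search using dot-product similarity."""
    scores = docs @ query
    return np.argsort(-scores, kind="stable")[:k]
```

The same ranking must be applied to both documents and queries, for example `prune(query[None, :], ranking, 0.5)[0]`, so that the remaining dimensions line up before scoring.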

Published

2025-10-15

Issue

Vol. 11 No. 1 (2025): ASAID – Argentine Symposium on Artificial Intelligence and Big Data

Section

Original papers

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Under these terms, the material may be shared (copied and redistributed in any medium or format) and adapted (remixed, transformed, and built upon to create another work), provided that a) the authorship and the original source of publication (journal and URL of the work) are cited, b) it is not used for commercial purposes, and c) the same license terms are maintained.

How to Cite

Delvechio, T., Rissola, E., & Tolosa, G. (2025). Dense Vectors Dimensions Ranking for Efficient Retrieval. JAIIO, Jornadas Argentinas De Informática, 11(1), 22-26. https://revistas.unlp.edu.ar/JAIIO/article/view/19736