Dense Vectors Dimensions Ranking for Efficient Retrieval

Authors

Keywords:

neural IR, dense vectors, dimensions ranking, efficiency

Abstract

Information retrieval over collections of millions of documents is a computationally intensive task. The emergence of dense representations (embeddings) enables the construction of vectors with hundreds of dimensions, turning the retrieval task into a nearest-neighbour vector search problem. We hypothesize that not all embedding dimensions are equally important for the retrieval task and that some can therefore be pruned. In this paper, we propose to rank embedding dimensions by their importance and evaluate different pruning methods against an objective effectiveness requirement.
Experiments with widely used embedding models and well-known document collections show that vector size can be reduced by up to 50% while retaining up to 90% of the effectiveness, thus improving efficiency.
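
The general idea of ranking dimensions and pruning the least important ones can be illustrated with a minimal sketch. The abstract does not state which importance criterion or pruning methods the paper uses, so the sketch below assumes per-dimension variance as a stand-in importance score and keeps the top half of the dimensions. All function names and the synthetic data are hypothetical and serve only as an illustration.

```python
import math
import random


def rank_dimensions_by_variance(vectors):
    """Return dimension indices ordered from highest to lowest variance.

    Variance is an ASSUMED importance proxy, not the paper's criterion.
    """
    n = len(vectors)
    dims = len(vectors[0])
    variances = []
    for d in range(dims):
        column = [v[d] for v in vectors]
        mean = sum(column) / n
        variances.append(sum((x - mean) ** 2 for x in column) / n)
    return sorted(range(dims), key=lambda d: variances[d], reverse=True)


def prune(vector, kept_dims):
    """Keep only the selected dimensions of a vector."""
    return [vector[d] for d in kept_dims]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, docs, k):
    """Brute-force nearest-neighbour search by cosine similarity."""
    scored = [(cosine(query, doc), i) for i, doc in enumerate(docs)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scored[:k]]


def overlap_at_k(full_ids, pruned_ids):
    """Fraction of the full-vector results preserved after pruning."""
    return len(set(full_ids) & set(pruned_ids)) / len(full_ids)


# Synthetic 8-dimensional "embeddings" whose dimensions differ in spread.
random.seed(0)
scales = [4.0, 3.0, 2.0, 1.0, 0.5, 0.25, 0.1, 0.05]
docs = [[random.gauss(0.0, s) for s in scales] for _ in range(200)]
query = [x + random.gauss(0.0, 0.1) for x in docs[0]]

ranking = rank_dimensions_by_variance(docs)
kept = ranking[: len(scales) // 2]          # prune 50% of the dimensions

full_results = top_k(query, docs, 10)
pruned_results = top_k(prune(query, kept), [prune(d, kept) for d in docs], 10)
agreement = overlap_at_k(full_results, pruned_results)
print(f"kept dimensions: {kept}, overlap@10: {agreement:.2f}")
```

The overlap between the full and pruned result lists is only a rough stand-in for an effectiveness requirement; an evaluation on a real collection would instead compare standard retrieval metrics computed against relevance judgments.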

Published

2025-10-15

How to Cite

Delvechio, T., Rissola, E., & Tolosa, G. (2025). Dense Vectors Dimensions Ranking for Efficient Retrieval. JAIIO, Jornadas Argentinas De Informática, 11(1), 22-26. https://revistas.unlp.edu.ar/JAIIO/article/view/19736