Dense Vectors Dimensions Ranking for Efficient Retrieval
Keywords:
neural IR, dense vectors, dimensions ranking, efficiency
Abstract
Information retrieval over collections of millions of documents is a computationally intensive task. The emergence of dense representations (embeddings) enables the construction of vectors with hundreds of dimensions, turning retrieval into a nearest-neighbour vector search problem. We hypothesize that not all embedding dimensions are equally important for the retrieval task and that, therefore, some can be pruned. In this paper, we propose to rank embedding dimensions by their importance and evaluate different pruning methods against an objective effectiveness requirement.
Using widely used embedding models and well-known document collections, our experiments show that vector size can be reduced by up to 50% while retaining up to 90% of the effectiveness, thus improving efficiency.
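As a rough illustration of the idea summarized in the abstract, the following minimal sketch ranks embedding dimensions and keeps only the top half before running a nearest-neighbour search. The abstract does not state the importance criterion or the effectiveness measure, so everything specific here is an assumption made for illustration: per-dimension variance stands in for importance, dot product is the similarity, top-k overlap stands in for effectiveness, the data are synthetic, and all function names are hypothetical.

```python
import random
from statistics import pvariance


def rank_dimensions(vectors):
    """Return dimension indices sorted from most to least important.

    Importance is the per-dimension variance across the collection.
    This is an illustrative assumption, not the paper's criterion.
    """
    n_dims = len(vectors[0])
    scores = [pvariance([v[d] for v in vectors]) for d in range(n_dims)]
    return sorted(range(n_dims), key=lambda d: scores[d], reverse=True)


def prune(vector, kept_dims):
    """Keep only the selected dimensions of a vector."""
    return [vector[d] for d in kept_dims]


def dot(a, b):
    """Dot product, used here as the similarity score (an assumption)."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query, docs, k):
    """Indices of the k documents most similar to the query."""
    order = sorted(range(len(docs)), key=lambda i: dot(query, docs[i]), reverse=True)
    return order[:k]


def overlap_at_k(full_hits, pruned_hits, k):
    """Fraction of the full-vector top-k that survives pruning.

    A hypothetical stand-in for an effectiveness measure.
    """
    return len(set(full_hits) & set(pruned_hits)) / k


random.seed(0)
n_docs, n_dims, k = 200, 32, 10

# Synthetic embeddings: the spread of dimension d shrinks as d grows.
docs = [[random.gauss(0, 1.0 / (d + 1)) for d in range(n_dims)] for _ in range(n_docs)]
query = [random.gauss(0, 1.0 / (d + 1)) for d in range(n_dims)]

ranking = rank_dimensions(docs)
kept = ranking[: n_dims // 2]  # prune 50% of the dimensions

full_hits = top_k(query, docs, k)
pruned_docs = [prune(v, kept) for v in docs]
pruned_hits = top_k(prune(query, kept), pruned_docs, k)

print("dimensions kept:", len(kept))
print("top-k overlap:", overlap_at_k(full_hits, pruned_hits, k))
```

The ranking is computed once over the collection, and the same set of kept dimensions is applied to both documents and queries, so the pruned vectors remain comparable. Any other importance score could be substituted in `rank_dimensions` without changing the rest of the sketch.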
License
Copyright (c) 2025 Tomás Delvechio, Esteban Rissola, Gabriel Tolosa

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Under these terms, the material may be shared (copied and redistributed in any medium or format) and adapted (remixed, transformed, and built upon to create another work), provided that a) the authorship and the original source of publication (journal and URL of the work) are cited, b) it is not used for commercial purposes, and c) the same license terms are maintained.











