Teaching SQL New Tricks: Efficient Vector Indexing with Trigrams

Esteban Rodríguez-Betancourt; Edgar Casasola-Murillo

Authors

Esteban Rodríguez-Betancourt Universidad de Costa Rica, Costa Rica
Edgar Casasola-Murillo Universidad de Costa Rica, Costa Rica

Keywords:

Databases, Indexes, Natural Language Processing, Word Embeddings

Abstract

With the growing use of vector embeddings in areas like natural language processing and recommendation systems, the need for effective storage and retrieval methods is increasingly important. However, deploying specialized databases for vector indexing can be challenging due to resource limitations or operational constraints. This paper introduces a novel approach that utilizes existing trigram indexes within SQL databases to efficiently manage vector embeddings. By adapting traditional relational databases to handle high-dimensional data, organizations can use their existing infrastructure without the need to invest in new database systems. This method reduces management complexity and costs associated with maintaining separate systems for vector data. We outline the process of converting vector embeddings for trigram indexing and evaluate the performance and recall through empirical analysis. This paper aims to offer a practical solution for researchers and practitioners seeking to integrate advanced vector-based queries into their current database systems, thereby enhancing the functionality and accessibility of vector embeddings in mainstream applications.
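
The abstract does not spell out how embeddings are converted for trigram indexing, so the following is only an illustrative sketch of one way such a conversion could work, not the authors' method. It assumes a hypothetical encoding in which each dimension becomes a three-character word (two base-36 characters for the dimension index plus `p` or `n` for the sign of the component). It also assumes a trigram extractor that pads each word with two leading spaces and one trailing space, in the style of PostgreSQL's `pg_trgm`. The names `encode_vector`, `trigrams`, and `trigram_similarity` are invented for this example.

```python
from typing import List, Set

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def _b36(i: int) -> str:
    """Two-character base-36 code for a dimension index (0..1295)."""
    if not 0 <= i < 36 * 36:
        raise ValueError("this sketch supports at most 1296 dimensions")
    return ALPHABET[i // 36] + ALPHABET[i % 36]


def encode_vector(vec: List[float]) -> str:
    """Hypothetical encoding: one 3-character word per dimension.

    Each word is the dimension index followed by 'p' (component >= 0)
    or 'n' (component < 0). This keeps only the sign of each component.
    """
    words = []
    for i, x in enumerate(vec):
        sign = "p" if x >= 0 else "n"
        words.append(_b36(i) + sign)
    return " ".join(words)


def trigrams(text: str) -> Set[str]:
    """Character trigrams per word, with two leading spaces and one
    trailing space of padding (assumed to mimic pg_trgm-style extraction)."""
    result: Set[str] = set()
    for word in text.split():
        padded = "  " + word + " "
        for k in range(len(padded) - 2):
            result.add(padded[k:k + 3])
    return result


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by total distinct trigrams (Jaccard)."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    if not union:
        return 0.0
    return len(ta & tb) / len(union)


if __name__ == "__main__":
    query = [0.5, -1.0, 0.2, 0.9]
    near = [0.4, -0.8, 0.1, -0.3]    # differs in sign on one dimension
    far = [-0.5, 1.0, -0.2, -0.9]    # differs in sign on every dimension

    q = encode_vector(query)
    print(q)
    print(trigram_similarity(q, encode_vector(near)))
    print(trigram_similarity(q, encode_vector(far)))
```

Under these assumptions, vectors that agree in sign on more dimensions share more words and therefore more trigrams, so a trigram index could serve as a coarse candidate filter before exact distances are computed. In a database with trigram support, the encoded string would sit in an ordinary text column covered by a trigram index. A sign-only code discards magnitude, so a practical scheme would likely need finer quantization.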

Published

2024-09-19

Issue

Vol. 10 No. 1 (2024): ASAID – Argentine Symposium on Artificial Intelligence and Big Data

Section

ASAID - Argentine Symposium on Artificial Intelligence and Data Science

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Under these terms, the material may be shared (copied and redistributed in any medium or format) and adapted (remixed, transformed, and built upon to create another work), provided that a) the authorship and the original source of publication (journal and URL of the work) are cited, b) it is not used for commercial purposes, and c) the same license terms are maintained.

How to Cite

Rodríguez-Betancourt, E., & Casasola-Murillo, E. (2024). Teaching SQL New Tricks: Efficient Vector Indexing with Trigrams. JAIIO, Jornadas Argentinas De Informática, 10(1), 150-157. https://revistas.unlp.edu.ar/JAIIO/article/view/17913
