Embeddings
#7, opened by JuanEmilio
Hi everyone, I'm trying to use this model for a personal project, and right now I'm working with a huge DB. I don't know how to give the model all the necessary information (the metadata from my DB), given that there is too much of it.
Is there any way to use embeddings or vectors with this model?
Hi @JuanEmilio, you can check out how we currently prune the metadata in this file (pruning.py). We basically embed each column together with its description (albeit with a much smaller encoder model), and then take the top k nearest neighbors to the question based on cosine similarity. You can replace or extend the kNN functionality with faiss or other tools that scale better. Hope this helps!
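For anyone reading later, here is a minimal sketch of the idea described above: embed each column with its description, embed the question, and keep the k columns with the highest cosine similarity. It is not the code in pruning.py. The names `toy_embed`, `top_k_columns`, and `DIM` are hypothetical, and the hashed bag-of-words encoder is only a stand-in so the example runs with the standard library; in practice you would swap in a real sentence encoder.

```python
import math
import re
import zlib
from typing import Dict, List, Tuple

DIM = 256  # size of the toy embedding space (arbitrary choice for this sketch)


def toy_embed(text: str) -> List[float]:
    """Stand-in encoder (assumption): hashed bag-of-words, L2-normalized.

    Replace this with a real sentence encoder for meaningful similarities.
    """
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a: List[float], b: List[float]) -> float:
    # Both inputs are unit length (or all zeros), so the dot product
    # equals the cosine similarity.
    return sum(x * y for x, y in zip(a, b))


def top_k_columns(
    question: str, columns: Dict[str, str], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k (column_name, score) pairs most similar to the question."""
    query_vec = toy_embed(question)
    scored = [
        (name, cosine(query_vec, toy_embed(f"{name} {description}")))
        for name, description in columns.items()
    ]
    # Exhaustive search: fine for thousands of columns, slow for millions.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical schema metadata, for illustration only.
    example_columns = {
        "customers.email": "email address of the customer",
        "orders.total_amount": "total amount paid for the order in USD",
        "products.weight_kg": "shipping weight of the product in kilograms",
    }
    question = "What is the total amount of each order?"
    for name, score in top_k_columns(question, example_columns, k=2):
        print(f"{name}: {score:.3f}")
```

Only the columns returned by `top_k_columns` would then go into the prompt. With a very large schema, the exhaustive sort is the part you would replace with an index from faiss or a similar library, and the column embeddings can be computed once and cached instead of being recomputed per question.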
Fantastic, I'll try it. Thank you so much!
JuanEmilio changed discussion status to closed