Embeddings

by JuanEmilio - opened Jan 10

Jan 10

Hi everyone, I'm trying to use this model for a personal project, and right now I'm working with a huge DB. I don't know how to give the model all the necessary information (the metadata from my DB), since there is too much of it to fit in the prompt.
Is there any way to use embeddings or vectors with this model?

jp-defog

Defog.ai org Jan 11

Hi @JuanEmilio, you can check out how we currently prune the metadata in this file (pruning.py). We basically embed each column together with its description (albeit with a much smaller encoder model), and then take the top k nearest neighbors to the question based on cosine similarity. You can replace or extend the kNN functionality with faiss or other tools that scale better. Hope this helps!
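To make the idea concrete, here is a minimal sketch of that pruning step. It is not the code in pruning.py: the schema, the function names, and especially `embed` are assumptions made up for illustration. `embed` is a toy bag-of-words stand-in for a real encoder model, chosen only so the example runs with the standard library.

```python
import math
import re
from collections import Counter

# Hypothetical schema metadata: (table.column, description) pairs.
COLUMNS = [
    ("customers.id", "unique identifier of the customer"),
    ("customers.signup_date", "date the customer created an account"),
    ("orders.total_amount", "total order price in US dollars"),
    ("orders.created_at", "timestamp when the order was placed"),
    ("products.name", "display name of the product"),
]


def embed(text):
    """Toy stand-in for an encoder model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_columns(question, columns, k=3):
    """Return the names of the k columns most similar to the question."""
    query_vec = embed(question)
    scored = [
        (cosine(query_vec, embed(f"{name} {description}")), name)
        for name, description in columns
    ]
    # Highest similarity first; only the k best columns go into the prompt.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


if __name__ == "__main__":
    print(top_k_columns("What is the total price of each order?", COLUMNS, k=2))
```

With a real encoder you would embed the columns once, cache the vectors, and only embed the question at query time. The brute-force scan in `top_k_columns` is the part that an index such as faiss would replace on a very large schema.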

JuanEmilio

Jan 13

Fantastic, I'll try it. Thank you so much

JuanEmilio changed discussion status to closed Jan 13
