prithivida
committed on
Commit
•
44f67a3
1
Parent(s):
1bebdbc
Update README.md
README.md
CHANGED
@@ -156,7 +156,7 @@ for query, query_embedding in zip(queries, query_embeddings):
 # FAQS
 
 #### How can I reduce overall inference cost ?
-- You can host these models without heavy torch dependency using the ONNX flavours of these models via [
+- You can host these models without heavy torch dependency using the ONNX flavours of these models via [FlashEmbed](https://github.com/PrithivirajDamodaran/flashembed) library.
 
 #### How do I reduce vector storage cost ?
 [Use Binary and Scalar Quantisation](https://huggingface.co/blog/embedding-quantization)
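For readers of this diff, the storage FAQ above can be illustrated with a minimal NumPy sketch of binary and scalar (int8) quantisation. It is not taken from the README or from the linked blog post. The helper names (`binary_quantize`, `scalar_quantize`, `hamming_distance`), the 384-dimensional size, and the random stand-in embeddings are all assumptions made for illustration.

```python
import numpy as np


def binary_quantize(embeddings: np.ndarray) -> np.ndarray:
    """Keep only the sign of each dimension and pack 8 dimensions per byte."""
    bits = embeddings > 0
    return np.packbits(bits, axis=-1)


def scalar_quantize(embeddings: np.ndarray) -> np.ndarray:
    """Map each dimension linearly onto int8 using its observed min/max.

    Assumption: the ranges come from the same batch being quantised; in
    practice they would normally be estimated on a calibration set.
    """
    lo = embeddings.min(axis=0)
    hi = embeddings.max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    scaled = np.round((embeddings - lo) / scale * 255.0) - 128.0
    return np.clip(scaled, -128, 127).astype(np.int8)


def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Count differing bits between two packed binary vectors."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


# Random stand-in embeddings (hypothetical: 4 vectors, 384 dimensions).
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((4, 384)).astype(np.float32)

binary = binary_quantize(embeddings)
int8 = scalar_quantize(embeddings)

print(embeddings.nbytes, int8.nbytes, binary.nbytes)
```

With float32 input, the int8 copy takes a quarter of the bytes and the packed binary copy takes one thirty-second. Binary vectors are then compared with Hamming distance rather than cosine similarity, which usually costs some retrieval quality.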