---
title: SimianDB demo on wikipedia dataset 6M
emoji: 📚
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 3.23.0
app_file: app.py
pinned: false
license: mit
models:
- sentence-transformers/all-MiniLM-L6-v2
- cross-encoder/ms-marco-MiniLM-L-6-v2
datasets:
- wikipedia
---
This Space is a demo for testing the capabilities of SimianDB, my simple vector store database.
The demo indexes the first paragraph of each of the 6,458,670 entries (about 6 million) in the Wikipedia dataset "20220301.en".
The vectors have been compressed to 8 bits for efficient storage, and the similarity calculation converts them to 32 bits on the fly, with only a minor impact on speed.
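
The snippet below is a minimal sketch of the general idea of 8-bit storage with on-the-fly conversion, not SimianDB's actual code. It assumes a symmetric per-vector scale (the README does not say which quantization scheme is used), and the helper names `quantize`, `dequantize`, and `cosine` are hypothetical.

```python
import math
from array import array


def quantize(vec):
    """Compress a float vector to signed 8-bit codes plus one scale factor.

    Assumption: symmetric scaling by the largest absolute component.
    """
    max_abs = max(abs(x) for x in vec) or 1.0  # avoid dividing by zero
    scale = max_abs / 127.0
    codes = array("b", (round(x / scale) for x in vec))  # 1 byte per value
    return codes, scale


def dequantize(codes, scale):
    """Expand 8-bit codes back to 32-bit floats at query time."""
    return array("f", (c * scale for c in codes))  # 4 bytes per value


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Usage: store the document vector in 8 bits, expand it only when scoring.
doc = [0.12, -0.5, 0.33, 0.9]
query = [0.1, -0.4, 0.3, 1.0]
codes, scale = quantize(doc)
score = cosine(query, dequantize(codes, scale))
```

Stored this way, each value takes one byte instead of four, and the rounding error per value is bounded by half the scale factor, which is why the similarity scores change very little.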
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference