metadata
title: SimianDB demo on wikipedia dataset 6M
emoji: π
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 3.23.0
app_file: app.py
pinned: false
license: mit
models:
- sentence-transformers/all-MiniLM-L6-v2
- cross-encoder/ms-marco-MiniLM-L-6-v2
datasets:
- wikipedia
This Space tests the capabilities of SimianDB, my simple vector store database. The demo indexes the first paragraph of each of the 6,458,670 entries in the Wikipedia dataset "20220301.en". The vectors are compressed to 8 bits for efficient storage, and similarity is computed by converting them on the fly to 32 bits, with only a minor impact on speed.
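The sketch below illustrates the general idea of storing vectors in 8 bits and converting them to 32 bits at query time. It is not SimianDB's actual code: the quantization scheme (one symmetric linear scale per vector), the use of cosine similarity, and the function names `quantize_int8` and `cosine_scores` are all assumptions made for illustration.

```python
import numpy as np


def quantize_int8(vectors):
    """Compress float32 rows to int8 codes plus one float32 scale per row.

    Assumption: symmetric linear quantization, where the largest absolute
    value in each row maps to 127.
    """
    vectors = np.asarray(vectors, dtype=np.float32)
    max_abs = np.abs(vectors).max(axis=1, keepdims=True)
    # Avoid dividing by zero for all-zero rows.
    scales = np.where(max_abs > 0, max_abs / 127.0, 1.0).astype(np.float32)
    codes = np.round(vectors / scales).astype(np.int8)
    return codes, scales


def cosine_scores(query, codes, scales):
    """Convert the int8 codes back to float32 on the fly and score them."""
    query = np.asarray(query, dtype=np.float32)
    query = query / max(float(np.linalg.norm(query)), 1e-12)
    restored = codes.astype(np.float32) * scales
    norms = np.linalg.norm(restored, axis=1, keepdims=True)
    restored = restored / np.maximum(norms, 1e-12)
    return restored @ query


# Random stand-in data; 384 is the embedding size of all-MiniLM-L6-v2.
rng = np.random.default_rng(0)
docs = rng.standard_normal((1000, 384)).astype(np.float32)
codes, scales = quantize_int8(docs)
scores = cosine_scores(docs[42], codes, scales)
best_match = int(np.argmax(scores))
```

Under these assumptions, and with 384-dimensional embeddings, the 6,458,670 vectors would take roughly 2.5 GB as 8-bit codes compared with roughly 9.9 GB as 32-bit floats, plus a small overhead for the per-vector scales.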
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference