---
language: pl
datasets:
- enelpol/czywiesz
---
# Model description
The model was created for selective question answering in Polish, i.e. for finding the passages that contain the answer to a given question.
It is used to encode the contexts (also called passages) in the DPR bi-encoder architecture, which requires two separate models:
the questions have to be encoded with the corresponding [question encoder](https://huggingface.co/enelpol/czywiesz-question).
The model was created by fine-tuning [HerBERT base cased](https://huggingface.co/allegro/herbert-base-cased) on the Czywiesz dataset.
The [Czywiesz](https://clarin-pl.eu/dspace/handle/11321/39) dataset contains questions and the corresponding articles extracted from the Polish Wikipedia.
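The context encoder can also be tried outside of Haystack. The snippet below is only a sketch under assumptions not stated in this card: that the checkpoint loads with the generic `transformers` `AutoModel` class and that the [CLS] token vector serves as the passage embedding, which is the usual DPR convention.
```python
# Sketch (assumptions: AutoModel-compatible checkpoint, [CLS] vector as DPR embedding).
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("enelpol/czywiesz-context")
model = AutoModel.from_pretrained("enelpol/czywiesz-context")

# Example passage (made up for illustration).
passage = "Wisła jest najdłuższą rzeką w Polsce."
inputs = tokenizer(passage, return_tensors="pt", truncation=True, max_length=256)

with torch.no_grad():
    outputs = model(**inputs)

# Take the [CLS] token representation as the passage embedding.
passage_embedding = outputs.last_hidden_state[:, 0, :]
print(passage_embedding.shape)  # (1, hidden_size)
```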
# Usage
The easiest way to use the model is with the [Haystack framework](https://haystack.deepset.ai/overview/intro).
```python
from haystack.document_stores import FAISSDocumentStore
from haystack.retriever import DensePassageRetriever

# Create a FAISS index that will hold the passage embeddings.
document_store = FAISSDocumentStore(faiss_index_factory_str="Flat")

# DPR bi-encoder: the question and the passage encoders are separate models.
retriever = DensePassageRetriever(
    document_store=document_store,
    query_embedding_model="enelpol/czywiesz-question",
    passage_embedding_model="enelpol/czywiesz-context"
)

# `documents` is assumed to be an iterable of Haystack documents holding the passages to index.
for document in documents:
    document_store.write_documents([document])

# Compute the passage embeddings and persist the index.
document_store.update_embeddings(retriever)
document_store.save("contexts.faiss")
```
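Once the index has been built, the retriever can be queried with Polish questions. A minimal sketch, assuming Haystack's standard `retrieve` method and `Document` objects that expose `score` and `content`; the question text is made up for illustration.
```python
# Query the index with the question encoder (example question is hypothetical).
results = retriever.retrieve(query="Kto napisał Pana Tadeusza?", top_k=5)
for doc in results:
    print(doc.score, doc.content)
```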