Spaces:
Configuration error
Configuration error
## Question Answering Application for Healthcare | |
This is a streamlit-based NLP application powering a question answering demo on healthcare data. It's easy to change and extend and can be used to try out Haystack's capabilities. | |
A video presentation of this demo is available on [YouTube](https://www.youtube.com/watch?v=pOnkGdOvYfo). To get started with Haystack please visit the [README](https://github.com/deepset-ai/haystack/tree/main#key-components) or check out our [tutorials](https://haystack.deepset.ai/tutorials/first-qa-system). | |
## Usage | |
The easiest way to run the application is through [Docker compose](https://docs.docker.com/compose/). | |
From this folder, just run: | |
```sh | |
docker compose up -d | |
``` | |
Docker will start three containers: | |
- `elasticsearch`, running an Elasticsearch instance with some data pre-loaded. | |
- `haystack-api`, running a pre-loaded Haystack pipeline behind a RESTful API. | |
- `ui`, running the streamlit application showing the UI and querying Haystack under the hood. | |
Once all the containers are up and running, you can open the user interface pointing your | |
browser to [http://localhost:8501](http://localhost:8501). | |
## Screencast | |
https://user-images.githubusercontent.com/4181769/231965471-48d581a2-e1aa-4316-b3a4-990d9c86800e.mov | |
## Evaluation Mode | |
The evaluation mode leverages the feedback REST API endpoint of haystack. The user has the options | |
"Wrong answer", "Wrong answer and wrong passage" and "Wrong answer and wrong passage" to give | |
feedback. | |
In order to use the UI in evaluation mode, you need an ElasticSearch instance with pre-indexed files | |
and the Haystack REST API. You can set the environment up via docker images. For ElasticSearch, you | |
can check out our [documentation](https://haystack.deepset.ai/usage/document-store#initialisation) | |
and for setting up the REST API this [link](https://github.com/deepset-ai/haystack/blob/main/README. | |
md#7-rest-api). | |
To enter the evaluation mode, select the checkbox "Evaluation mode" in the sidebar. The UI will load | |
the predefined questions from the file [`eval_labels_examples`](https://raw.githubusercontent.com/ | |
deepset-ai/haystack/main/ui/ui/eval_labels_example.csv). The file needs to be prefilled with your | |
data. This way, the user will get a random question from the set and can give his feedback with the | |
buttons below the questions. To load a new question, click the button "Get random question". | |
The file just needs to have two columns separated by semicolon. You can add more columns but the UI | |
will ignore them. Every line represents a questions answer pair. The columns with the questions needs | |
to be named “Question Text” and the answer column “Answer” so that they can be loaded correctly. | |
Currently, the easiest way to create the file is manually by adding question answer pairs. | |
The feedback can be exported with the API endpoint `export-doc-qa-feedback`. To learn more about | |
finetuning a model with user feedback, please check out our [docs](https://haystack.deepset.ai/usage/ | |
domain-adaptation#user-feedback). | |
## Query different data | |
If you want to use this application to query a different corpus, the easiest way is to build the | |
Elasticsearch image, load your own text data and then use the same Compose file to run all the | |
three containers needed. This will require [Docker](https://docs.docker.com/get-docker/) to be | |
properly installed on your machine. | |
### Running your custom build | |
Once done, modify the `elasticsearch` section in the `docker-compose.yml` file, changing this line: | |
```yaml | |
image: "julianrisch/elasticsearch-healthcare" | |
``` | |
to: | |
```yaml | |
image: "my-docker-acct/elasticsearch-custom" | |
``` | |
Finally, run the compose file as usual: | |
```sh | |
docker-compose up | |
``` | |
## Development | |
If you want to change the streamlit application, you need to setup your Python environment first. | |
From a virtual environment, run: | |
```sh | |
pip install -e . | |
``` | |
The app requires the Haystack RESTful API to be ready and accepting connections at `http://localhost:8000`, you can use Docker compose to start only the required containers: | |
```sh | |
docker-compose up elasticsearch haystack-api | |
``` | |
At this point you should be able to make changes and run the streamlit application with: | |
``` | |
streamlit run ui/webapp.py | |
``` | |
## Using GPUs with Docker | |
Assuming you have [nvidia drivers installed](https://developer.nvidia.com/cuda-downloads) on your machine, you can configure docker to use the GPU for the Haystack API container to speed it up. | |
First, configure the nvidia repository as described here: https://nvidia.github.io/nvidia-container-runtime/. For example: | |
```sh | |
curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \ | |
sudo apt-key add - | |
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) | |
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \ | |
sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list | |
sudo apt-get update | |
``` | |
Then, install nvidia-container-runtime as described here: https://docs.docker.com/config/containers/resource_constraints/#access-an-nvidia-gpu. | |
For example: | |
```sh | |
sudo apt-get install nvidia-container-runtime | |
``` | |
Restart the Docker daemon (or simply the machine). | |
Finally, you can change the docker compose file `healthcare/docker-compose.yml` so that a docker image prepared for usage with GPUs is used and one GPU is reserved for the Haystack API container: | |
```yaml | |
haystack-api: | |
image: "deepset/haystack:gpu-v1.14.0" | |
ports: | |
- 8000:8000 | |
restart: on-failure | |
volumes: | |
- ./haystack-api:/home/node/app | |
environment: | |
- DOCUMENTSTORE_PARAMS_HOST=elasticsearch | |
- PIPELINE_YAML_PATH=/home/node/app/pipelines_biobert.haystack-pipeline.yml | |
depends_on: | |
elasticsearch: | |
condition: service_healthy | |
deploy: | |
resources: | |
reservations: | |
devices: | |
- driver: nvidia | |
count: 1 | |
capabilities: [gpu] | |
``` | |