---
title: Self Chat
emoji: 🤖~🤖
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
tags:
- chatbot
short_description: Generating synthetic data via self-chatting
---
|
|
|
|
|
|
|
|
|
## Dependency

Install llama-cpp-python with the following command, which builds it against OpenBLAS:
|
```sh
pip install git+https://github.com/abetlen/llama-cpp-python.git -C cmake.args="-DGGML_BLAS=ON;-DGGML_BLAS_VENDOR=OpenBLAS"
```
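
A quick way to verify the build is to load a local GGUF model and run a single chat completion. This is a minimal sketch; the model path is a placeholder, not a file shipped with this repo:

```python
# Sanity check for the llama-cpp-python install.
# "path/to/model.gguf" is a placeholder: point it at any local GGUF file.
from llama_cpp import Llama

llm = Llama(model_path="path/to/model.gguf", n_ctx=2048, verbose=False)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=32,
)
print(out["choices"][0]["message"]["content"])
```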
|
|
|
|
|
|
|
|
|
## Local Inference

```sh
python models/cpp_qwen2.py
```
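
Conceptually, a self-chat loop has a single model play both speakers by swapping the message roles between turns. The sketch below illustrates that idea under assumed names (placeholder model path, hypothetical turn count); it is not the actual contents of `models/cpp_qwen2.py`:

```python
# Illustrative self-chat loop: one model plays both sides of the dialogue.
from llama_cpp import Llama

llm = Llama(model_path="path/to/qwen2.gguf", n_ctx=2048, verbose=False)

# The transcript, seen from the perspective of whoever speaks next:
# the other side's turns are "user", one's own past turns are "assistant".
history = [{"role": "user", "content": "Hi! What shall we talk about?"}]

for _ in range(4):  # number of generated turns (hypothetical)
    reply = llm.create_chat_completion(messages=history, max_tokens=128)
    text = reply["choices"][0]["message"]["content"]
    print(text)
    # Swap perspectives so the model now answers its own reply.
    history = [
        {"role": "user" if m["role"] == "assistant" else "assistant",
         "content": m["content"]}
        for m in history
    ]
    history.append({"role": "user", "content": text})
```

Each generated turn is appended to the transcript, so the finished conversation can be saved as a synthetic dialogue sample.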
|
|
|
## Serverless Inference

```sh
python client_gradio.py
```
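
`client_gradio.py` queries a remote endpoint rather than a local GGUF file. As a rough equivalent, a single serverless chat-completion call via `huggingface_hub` looks like the sketch below; the client library and model id are assumptions, not necessarily what the script uses:

```python
# One serverless chat-completion call (sketch; the model id is an assumption).
from huggingface_hub import InferenceClient

client = InferenceClient("Qwen/Qwen2-7B-Instruct")
resp = client.chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=64,
)
print(resp.choices[0].message.content)
```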
|
|
|
|
|
For streaming inference:

```sh
python client_streaming.py
```
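
The streaming variant differs only in consuming incremental deltas instead of waiting for the full reply. With the same assumed client:

```python
# Streaming sketch: print tokens as they arrive (same assumptions as above).
from huggingface_hub import InferenceClient

client = InferenceClient("Qwen/Qwen2-7B-Instruct")
for chunk in client.chat_completion(
    messages=[{"role": "user", "content": "Tell me a story."}],
    max_tokens=128,
    stream=True,
):
    print(chunk.choices[0].delta.content or "", end="", flush=True)
print()
```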
|
|
|
|
|
|