---
base_model: ytu-ce-cosmos/Turkish-LLaVA-v0.1
license: mit
language:
- tr
tags:
- LLaVA
- llava_llama
pipeline_tag: image-text-to-text
---

# Turkish-LLaVA-v0.1-Q4_K_M-GGUF

This model is a Q4_K_M-quantized GGUF conversion of the [ytu-ce-cosmos/Turkish-LLaVA-v0.1](https://huggingface.co/ytu-ce-cosmos/Turkish-LLaVA-v0.1) vision-language model, produced with [llama.cpp](https://github.com/ggerganov/llama.cpp).
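
For reference, the quantization step in llama.cpp typically looks like the sketch below. The file names are illustrative assumptions; `llama-quantize` operates on an F16 GGUF produced beforehand by llama.cpp's conversion scripts.

```shell
# Sketch: quantize an F16 GGUF down to Q4_K_M with llama.cpp's llama-quantize tool.
# File names are illustrative; skip gracefully if llama.cpp is not on the PATH.
if command -v llama-quantize >/dev/null 2>&1; then
    llama-quantize Turkish-LLaVA-v0.1-F16.gguf Turkish-LLaVA-v0.1-Q4_K_M.gguf Q4_K_M
else
    echo "llama-quantize not found; build llama.cpp first"
fi
```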

## Usage

You can use the model with the [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python) package as follows:

```py
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llama3VisionAlphaChatHandler

llm = Llama(
    model_path="Turkish-LLaVA-v0.1-Q4_K_M.gguf",  # path to the language model
    n_gpu_layers=-1,  # offload all layers to the GPU
    chat_handler=Llama3VisionAlphaChatHandler(
        # path to the image encoder
        clip_model_path="Turkish-LLaVA-v0.1-mmproj-F16.gguf",
    ),
    seed=1337,  # for reproducible results
    n_ctx=4096,  # n_ctx should be increased to accommodate the image embedding
    verbose=False,  # disable logging
)

# URL of the input image
url = "https://huggingface.co/ytu-ce-cosmos/Turkish-LLaVA-v0.1/resolve/main/example.jpg"

messages = [
    # "You are a helpful assistant."
    {"role": "system", "content": "Sen yardımsever bir asistansın."},
    {
        "role": "user",
        "content": [
            # "What do you see in this image?"
            {"type": "text", "text": "Bu resimde neler görüyorsun?"},
            {"type": "image_url", "image_url": {"url": url}},
        ],
    },
]

response = llm.create_chat_completion(
    messages=messages,
    max_tokens=64,
)

print(response["choices"][0]["message"]["content"])
# Output: Resimde, sarı çiçeklerle çevrili bir köpek yavrusu görülüyor.
# ("In the image, a puppy surrounded by yellow flowers can be seen.")
```
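
llama-cpp-python's vision chat handlers also accept images supplied as base64 data URIs, which avoids an HTTP fetch when the image is a local file. A minimal helper sketch (the function name and the default JPEG MIME type are illustrative assumptions):

```python
import base64


def image_to_data_uri(path: str, mime: str = "image/jpeg") -> str:
    """Encode a local image file as a base64 data URI for the chat handler."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"


# Use it in place of the remote URL in the user message:
# {"type": "image_url", "image_url": {"url": image_to_data_uri("example.jpg")}}
```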