---
library_name: peft
base_model: Qwen/Qwen-VL-Chat-Int4
---

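# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->

A QLoRA adapter for Qwen-VL that, given an image and the prompt `[FUNCTION CALL]`, returns a function schema describing the image together with the expected output for that schema.
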
## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** Agora Research
- **Model type:** Vision Language Model
- **Language(s) (NLP):** English/Chinese
- **Finetuned from model:** Qwen-VL

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/QwenLM/Qwen-VL
- **Paper:** https://arxiv.org/pdf/2308.12966.pdf

## Uses

```
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer
from transformers.generation import GenerationConfig

# Note: The default behavior now has injection attack prevention off.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-VL", trust_remote_code=True)

# Load the base model and apply the fine-tuned adapter on top of it.
model = AutoPeftModelForCausalLM.from_pretrained(
    "Qwen-VL-FNCall-qlora/",  # path to the output directory
    device_map="cuda",
    fp16=True,
    trust_remote_code=True,
).eval()

# Specify hyperparameters for generation (set generation_config explicitly if transformers < 4.32.0)
# model.generation_config = GenerationConfig.from_pretrained("Qwen/Qwen-VL-Chat", trust_remote_code=True)

# 1st dialogue turn
query = tokenizer.from_list_format([
    {'image': 'https://images.rawpixel.com/image_800/cHJpdmF0ZS9sci9pbWFnZXMvd2Vic2l0ZS8yMDIzLTA4L3Jhd3BpeGVsX29mZmljZV8xNV9waG90b19vZl9hX2RvZ19ydW5uaW5nX3dpdGhfb3duZXJfYXRfcGFya19lcF9mM2I3MDQyZC0zNWJlLTRlMTQtOGZhNy1kY2Q2OWQ1YzQzZjlfMi5qcGc.jpg'},  # Either a local path or a URL
    {'text': "[FUNCTION CALL]"},
])
print("sending model to chat")
response, history = model.chat(tokenizer, query=query, history=None)
print(response)
```

Printed result:

```
[FUNCTION CALL]
{{
    'type': 'object',
    'properties': {{
        'puppy_colors': {{
            'type': 'array',
            'description': 'The colors of the puppies in the image.',
            'items': {{
                'type': 'string',
                'enum': ['golden']
            }}
        }},
        'puppy_posture': {{
            'type': 'string',
            'description': 'The posture of the puppies in the image.',
            'enum': ['sitting']
        }},
        'puppy_expression': {{
            'type': 'string',
            'description': 'The expression of the puppies in the image.',
            'enum': ['smiling']
        }},
        'puppy_location': {{
            'type': 'string',
            'description': 'The location of the puppies in the image.',
            'enum': ['on a green field with orange flowers']
        }},
        'puppy_background': {{
            'type': 'string',
            'description': 'The background of the puppies in the image.',
            'enum': ['green field with orange flowers']
        }}
    }}
}}

[EXPECTED OUTPUT]
{{
    'puppy_colors': ['golden'],
    'puppy_posture': 'sitting',
    'puppy_expression': 'smiling',
    'puppy_location': 'on a green field with orange flowers',
    'puppy_background': 'green field with orange flowers'
}}
```
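
The printed result is plain text, so downstream code has to turn it into data before using it. The following is a minimal sketch of one way to do that, assuming the response always follows the layout shown above: a `[FUNCTION CALL]` section, then an `[EXPECTED OUTPUT]` section, each holding a Python-style dictionary with doubled braces. The helper names `parse_function_call` and `_literal` are hypothetical and not part of Qwen-VL or PEFT.

```python
import ast

SCHEMA_TAG = "[FUNCTION CALL]"
OUTPUT_TAG = "[EXPECTED OUTPUT]"


def _literal(block: str):
    """Collapse doubled braces and evaluate the block as a Python literal."""
    cleaned = block.replace("{{", "{").replace("}}", "}")
    return ast.literal_eval(cleaned.strip())


def parse_function_call(response: str):
    """Split a response into (schema, expected_output) dictionaries.

    Assumes the two-section layout shown in the example output; raises
    ValueError if the [EXPECTED OUTPUT] marker is missing.
    """
    schema_part, marker, output_part = response.partition(OUTPUT_TAG)
    if not marker:
        raise ValueError("response has no [EXPECTED OUTPUT] section")
    # Drop the leading [FUNCTION CALL] tag before evaluating the schema.
    schema_part = schema_part.replace(SCHEMA_TAG, "", 1)
    return _literal(schema_part), _literal(output_part)
```

With the `response` string from the snippet above, `schema, expected = parse_function_call(response)` would give two ordinary dictionaries. `ast.literal_eval` is used instead of `json.loads` because the output uses single quotes, and instead of `eval` because it only accepts literals.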

### Direct Use

Send an image together with the text `[FUNCTION CALL]`. The model can also be used for regular Qwen-VL inference.
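
Both modes use the same list-format input and differ only in the text item. As a small sketch, the helper below builds that list for either case; `build_query_items` and `FUNCTION_CALL_PROMPT` are hypothetical names introduced here for illustration, and the image path and the free-form prompt are placeholders.

```python
FUNCTION_CALL_PROMPT = "[FUNCTION CALL]"


def build_query_items(image: str, text: str = FUNCTION_CALL_PROMPT):
    """Return the list-format items for one image plus a text prompt.

    `image` is either a local path or a URL. With the default `text`,
    the items request function-call output; any other text gives a
    regular Qwen-VL prompt.
    """
    return [{"image": image}, {"text": text}]


# Function-call mode (placeholder image path).
fncall_items = build_query_items("photo.jpg")

# Regular inference with a free-form prompt (placeholder prompt text).
plain_items = build_query_items("photo.jpg", "Describe this image.")
```

Either list can then be passed to `tokenizer.from_list_format(...)` exactly as in the Uses snippet.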

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Using transformers >= 4.32.0 is recommended.
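
To check the installed version against this recommendation at runtime, something like the following sketch could be used. It relies only on the standard library; the function names are hypothetical, and the version parsing is deliberately simple (it keeps the leading integers and ignores suffixes such as `.dev0` or `rc1`).

```python
from importlib import metadata

MIN_RECOMMENDED = (4, 32, 0)


def parse_version(version: str):
    """Turn '4.32.0' or '4.37.0.dev0' into a 3-tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions such as '4.32' so tuple comparison behaves.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def meets_recommendation(version: str, minimum=MIN_RECOMMENDED) -> bool:
    """True if `version` is at least the recommended transformers release."""
    return parse_version(version) >= minimum


def installed_transformers_ok() -> bool:
    """Check the installed transformers package; False if it is missing."""
    try:
        return meets_recommendation(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return False
```

If `installed_transformers_ok()` returns `False`, the commented-out `generation_config` line in the Uses snippet is the place to set generation hyperparameters explicitly.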

## How to Get Started with the Model

```
# Assumes `tokenizer` has been loaded as shown in the Uses section.
query = tokenizer.from_list_format([
    {'image': 'https://images.rawpixel.com/image_800/cHJpdmF0ZS9sci9pbWFnZXMvd2Vic2l0ZS8yMDIzLTA4L3Jhd3BpeGVsX29mZmljZV8xNV9waG90b19vZl9hX2RvZ19ydW5uaW5nX3dpdGhfb3duZXJfYXRfcGFya19lcF9mM2I3MDQyZC0zNWJlLTRlMTQtOGZhNy1kY2Q2OWQ1YzQzZjlfMi5qcGc.jpg'},  # Either a local path or a URL
    {'text': "[FUNCTION CALL]"},
])
```

## Training Details

### Training Data

https://huggingface.co/datasets/AgoraX/OpenImage-FNCall-50k

### Training Procedure

QLoRA fine-tuning for 1 epoch, 1000 steps.
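
The card does not state a batch size, but one can be estimated from the figures given. The sketch below is back-of-the-envelope arithmetic only: it assumes the dataset holds roughly 50,000 examples (inferred from the "50k" in its name, not confirmed here) and that the 1000 steps cover exactly one pass over it. The function name is hypothetical.

```python
def implied_effective_batch_size(num_examples: int, steps: int, epochs: float = 1.0) -> float:
    """Examples consumed per optimizer step if `steps` cover `epochs` passes."""
    if steps <= 0:
        raise ValueError("steps must be positive")
    return num_examples * epochs / steps


# Assumption: ~50,000 examples, taken from the "50k" in the dataset name.
estimate = implied_effective_batch_size(50_000, 1000)
print(estimate)  # 50.0
```

Under those assumptions the effective batch size (per-device batch size times gradient accumulation steps times number of devices) would be about 50. If training used only a subset of the dataset, the real figure would be smaller.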

### Framework versions

- PEFT 0.7.1