LeroyDyer/SpydazWeb_VisonEncoderDecoder_Project

Image-Text-to-Text

vision-encoder-decoder

text-generation

image-text-to-image-text


LeroyDyer committed on Apr 8

Commit 3fa67c7, 1 Parent(s): 89b4580

Update README.md

Files changed (1)

README.md +28 -1

@@ -69,7 +69,34 @@ Users (both direct and downstream) should be made aware of the risks, biases and
 ## How to Get Started with the Model
-Use the code below to get started with the model.
 [More Information Needed]

 ## How to Get Started with the Model
+```python
+from transformers import AutoProcessor, VisionEncoderDecoderModel
+import requests
+from PIL import Image
+import torch
+processor = AutoProcessor.from_pretrained("LeroyDyer/Mixtral_AI_Cyber_Q_Vision")
+model = VisionEncoderDecoderModel.from_pretrained("LeroyDyer/Mixtral_AI_Cyber_Q_Vision")
+# load image from the IAM dataset
+url = "https://fki.tic.heia-fr.ch/static/img/a01-122-02.jpg"
+image = Image.open(requests.get(url, stream=True).raw).convert("RGB")
+# training
+model.config.decoder_start_token_id = processor.tokenizer.eos_token_id
+model.config.pad_token_id = processor.tokenizer.pad_token_id
+model.config.vocab_size = model.config.decoder.vocab_size
+pixel_values = processor(image, return_tensors="pt").pixel_values
+text = "hello world"
+labels = processor.tokenizer(text, return_tensors="pt").input_ids
+outputs = model(pixel_values=pixel_values, labels=labels)
+loss = outputs.loss
+# inference (generation)
+generated_ids = model.generate(pixel_values)
+generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
+```
 [More Information Needed]