Commit 2b0afb0 by mohammad-shirkhani (1 parent: 0f03fc1)

Update README.md

  1. README.md +28 -60
README.md CHANGED
@@ -1,73 +1,41 @@
- ---
- license: cc-by-nc-sa-4.0
- language:
- - fa
- - en
- library_name: transformers
- tags:
- - text-to-image
- - stable-diffusion
- - transformers
- pipeline_tag: text-to-image
- co2_eq_emissions:
-   emissions: 200000
- ---
 
- <p align="center">
- <img src="PersianToImage.jpg" alt="PersianToImage logo" width=200/>
- </p>

- # <span style="font-variant:small-caps;">PersianToImage</span>

- <span style="font-variant:small-caps;">PersianToImage</span> is a unique pipeline that translates Persian text to English and generates corresponding images using a fine-tuned Stable Diffusion model. This tool is designed to easily convert descriptive Persian text into high-quality images, making it ideal for creative projects, visual content generation, and more.

- ## Model Description

- - **Developed by:** [Your Name](mailto:your.email@example.com)
- - **Model type:** Text-to-Image generation
- - **Languages:** Persian and English
- - **License:** [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) (non-commercial use only)

- ## How to Get Started with the Model

- Use the code below to get started with the model.
- Make sure you have installed <code><b>torch</b></code>, <code><b>diffusers</b></code>, <code><b>transformers</b></code>, and <code><b>accelerate</b></code> libraries.
 
- ```python
- import torch
- from transformers import MT5ForConditionalGeneration, T5Tokenizer
- from diffusers import StableDiffusionPipeline
-
- device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-
- class PersianToImageModel:
-     def __init__(self, translation_model_name, image_model_name, device):
-         self.device = device
-
-         self.translation_model = MT5ForConditionalGeneration.from_pretrained(translation_model_name).to(device)
-         self.translation_tokenizer = T5Tokenizer.from_pretrained(translation_model_name)

-         self.image_model = StableDiffusionPipeline.from_pretrained(image_model_name).to(device)

-     def translate_text(self, persian_text):
-         input_ids = self.translation_tokenizer.encode(persian_text, return_tensors="pt").to(self.device)
-         translated_ids = self.translation_model.generate(input_ids, max_length=512, num_beams=4, early_stopping=True)
-         translated_text = self.translation_tokenizer.decode(translated_ids[0], skip_special_tokens=True)
-         return translated_text
-
-     def generate_image(self, english_text):
-         image = self.image_model(english_text).images[0]
-         return image
-
-     def __call__(self, persian_text):
-         english_text = self.translate_text(persian_text)
-         print(f"Translated Text: {english_text}")
-
-         return self.generate_image(english_text)

- # Instantiate the model
- translation_model_name = 'mohammad-shirkhani/finetune_persian_to_english_mt5_base_summarize_on_celeba_hq'
- image_model_name = 'ebrahim-k/Stable-Diffusion-1_5-FT-celeba_HQ_en'

- persian_to_image_model = PersianToImageModel(translation_model_name, image_model_name, device)

+ # Persian-to-Image Text-to-Image Pipeline

+ ## Model Overview

+ This model pipeline is designed to generate images from Persian text descriptions by translating the Persian text into English and then using a fine-tuned Stable Diffusion model to generate the corresponding image. The pipeline combines two models: a translation model (`mohammad-shirkhani/finetune_persian_to_english_mt5_base_summarize_on_celeba_hq`) and an image generation model (`ebrahim-k/Stable-Diffusion-1_5-FT-celeba_HQ_en`).

+ ## Model Details

+ ### Translation Model
+ - **Model Name**: `mohammad-shirkhani/finetune_persian_to_english_mt5_base_summarize_on_celeba_hq`
+ - **Architecture**: mT5
+ - **Purpose**: This model is used to translate Persian text into English. It has been fine-tuned specifically on the CelebA-HQ dataset for summarization tasks, making it well-suited for translating descriptions of facial features.

+ ### Image Generation Model
+ - **Model Name**: `ebrahim-k/Stable-Diffusion-1_5-FT-celeba_HQ_en`
+ - **Architecture**: Stable Diffusion 1.5
+ - **Purpose**: This model generates high-quality images from the English text produced by the translation model. It has been fine-tuned on the CelebA-HQ dataset, making it particularly effective for generating realistic human faces based on text descriptions.

+ ## Pipeline Description

+ The pipeline works as follows:

+ 1. **Text Translation**: The Persian input text is translated into English using the mT5-based translation model.
+ 2. **Image Generation**: The translated English text is then fed into the Stable Diffusion model to generate the corresponding image.
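
Reviewer note (not part of the commit): the two steps listed above can be sketched as a small wrapper that chains a translator and an image generator. This is a minimal sketch under stated assumptions, not the author's implementation. The names `PersianToImagePipeline` and `load_pipeline` are hypothetical; the model-loading and generation calls inside `load_pipeline` mirror those in the code block this commit removes from the README.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PersianToImagePipeline:
    """Chain a Persian-to-English translator with a text-to-image generator."""

    translate: Callable[[str], str]  # Persian text -> English text
    generate: Callable[[str], Any]   # English prompt -> image object

    def __call__(self, persian_text: str) -> Any:
        english_text = self.translate(persian_text)  # step 1: text translation
        return self.generate(english_text)           # step 2: image generation


def load_pipeline(translation_model_name: str, image_model_name: str,
                  device: str = "cpu") -> PersianToImagePipeline:
    """Hypothetical loader that builds both stages from Hugging Face checkpoints."""
    # Imported lazily so the wrapper class stays usable without these packages.
    from diffusers import StableDiffusionPipeline
    from transformers import MT5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(translation_model_name)
    translator = MT5ForConditionalGeneration.from_pretrained(translation_model_name).to(device)
    image_model = StableDiffusionPipeline.from_pretrained(image_model_name).to(device)

    def translate(persian_text: str) -> str:
        # Same generation settings as the removed README code.
        input_ids = tokenizer.encode(persian_text, return_tensors="pt").to(device)
        output_ids = translator.generate(input_ids, max_length=512, num_beams=4, early_stopping=True)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)

    def generate(english_text: str) -> Any:
        return image_model(english_text).images[0]

    return PersianToImagePipeline(translate=translate, generate=generate)
```

Under these assumptions, calling `load_pipeline` with the two model names given in the README would yield a callable object that could stand in for the `persian_to_image_model` used in the example usage. Keeping the two stages as plain callables also lets either one be swapped or stubbed without downloading the checkpoints.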
 
 
 
 
 
 
 
 
 
 
 
+ ### Example Usage

+ ```python
+ from IPython.display import display

+ # Persian text describing a person
+ persian_text = "این زن دارای موهای موج دار ، لب های بزرگ و موهای قهوه ای است و رژ لب دارد.این زن موهای موج دار و لب های بزرگ دارد و رژ لب دارد.فرد جذاب است و موهای موج دار ، چشم های باریک و موهای قهوه ای دارد."

+ # Generate and display the image
+ image = persian_to_image_model(persian_text)
+ display(image)

+ # Another example
+ persian_text2 = "این مرد جذاب دارای موهای قهوه ای ، سوزش های جانبی ، دهان کمی باز و کیسه های زیر چشم است.این فرد جذاب دارای کیسه های زیر چشم ، سوزش های جانبی و دهان کمی باز است."
+ image2 = persian_to_image_model(persian_text2)
+ display(image2)