---
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
datasets:
- yashvoladoddi37/kanjienglish
language:
- en
- ja
library_name: diffusers
pipeline_tag: text-to-image
---

# Kanji Diffusion v1-4 Model Card

Kanji Diffusion is a latent text-to-image diffusion model that hallucinates Kanji characters from any English prompt.

## Fine-tuned Model Details

- **Developed by:** Yashpreet Voladoddi
- **Model type:** Diffusion-based text-to-image generation model, fine-tuned from Stable Diffusion v1.4 with the LoRA training script shown under Training.

### Colab

To run the pipeline and see how my model generates Kanji characters, follow the code below on Colab. Use a T4 GPU runtime; otherwise each image takes a long time to generate. Make sure you have your Hugging Face access token ready for the login step.

```python
import os
from google.colab import drive

# Work from Google Drive so generated images persist across sessions.
drive.mount('/content/drive')
os.chdir("/content/drive/MyDrive")

!pip install diffusers
!git clone https://github.com/huggingface/diffusers
!huggingface-cli login

from diffusers import StableDiffusionPipeline
import torch

torch.cuda.empty_cache()

model_path = "yashvoladoddi37/kanji-diffusion-v1-4"

# Load the base Stable Diffusion v1.4 pipeline in half precision.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
    use_safetensors=True,
).to("cuda")

# Apply the fine-tuned LoRA attention weights to the UNet,
# then move the pipeline to the GPU again so the new weights live there too.
pipe.unet.load_attn_procs(model_path)
pipe.to("cuda")

prompt = "A Kanji meaning baby robot"
image = pipe(prompt).images[0]
image.save("baby-robot-kanji-v1-4.png")
```

### Limitations

## Training

**Training Data**

**Hardware:** Nvidia GTX 1650 (4 GB VRAM, 8 GB RAM) and a T4 GPU on Colab

**Training Script:**

```python
!accelerate launch train_text_to_image_lora.py \
  --pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" \
  --dataset_name="yashvoladoddi37/kanjienglish" \
  --image_column="image" --caption_column="text" \
  --resolution=512 \
  --random_flip \
  --train_batch_size=1 \
  --num_train_epochs=1 \
  --checkpointing_steps=500 \
  --learning_rate=1e-04 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --seed=42 \
  --output_dir="kanji-diffusion-v1-4" \
  --validation_prompt="A kanji meaning Elon Musk" \
  --push_to_hub
```
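
The flags in the training script are easy to mistype, since each one must be written as `--key=value` with no spaces around `=` (a stray space makes the argument parser read the value as a separate argument). As a minimal sketch, the same command can be assembled from a dictionary in Python. The helper `build_train_command` is a hypothetical name introduced here for illustration and is not part of diffusers or accelerate; the option values are copied from the training script above.

```python
import shlex


def build_train_command(script="train_text_to_image_lora.py", **options):
    """Build an argv list for `accelerate launch <script> --key=value ...`.

    Hypothetical helper for illustration only.
    """
    argv = ["accelerate", "launch", script]
    for key, value in options.items():
        if value is True:
            # Boolean switches such as --random_flip take no value.
            argv.append(f"--{key}")
        elif value is not False and value is not None:
            # Every other option is emitted as --key=value, no spaces around "=".
            argv.append(f"--{key}={value}")
    return argv


# Values copied from the training script in this model card.
train_options = {
    "pretrained_model_name_or_path": "CompVis/stable-diffusion-v1-4",
    "dataset_name": "yashvoladoddi37/kanjienglish",
    "image_column": "image",
    "caption_column": "text",
    "resolution": 512,
    "random_flip": True,
    "train_batch_size": 1,
    "num_train_epochs": 1,
    "checkpointing_steps": 500,
    "learning_rate": "1e-04",
    "lr_scheduler": "constant",
    "lr_warmup_steps": 0,
    "seed": 42,
    "output_dir": "kanji-diffusion-v1-4",
    "validation_prompt": "A kanji meaning Elon Musk",
    "push_to_hub": True,
}

command = build_train_command(**train_options)

# Print a shell-quoted version for inspection before launching anything.
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` from the directory that contains `train_text_to_image_lora.py`. Passing a list rather than a single string avoids shell-quoting problems with values that contain spaces, such as the validation prompt.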