Kanji Diffusion v1-4 Model Card

Kanji Diffusion is a latent text-to-image diffusion model capable of hallucinating Kanji characters given any English prompt.

Fine-tuned Model Details

  • Developed by: Yashpreet Voladoddi
  • Model type: Diffusion-based text-to-image generation model, fine-tuned from Stable Diffusion v1.4 (CompVis/stable-diffusion-v1-4) with LoRA.

Colab

To run the pipeline and see how the model generates Kanji characters, follow the steps below in Colab. Use a T4 GPU runtime; otherwise each image takes a long time to generate. You will need a Hugging Face access token for the login step.

import os
from google.colab import drive
drive.mount('/content/drive')
os.chdir("/content/drive/MyDrive")

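# Install the libraries the pipeline needs
!pip install diffusers transformers accelerate
# The repository clone is only needed for the training script (examples/text_to_image)
!git clone https://github.com/huggingface/diffusers
# Paste your Hugging Face access token when prompted
!huggingface-cli login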

from diffusers import StableDiffusionPipeline
import torch
torch.cuda.empty_cache()

model_path = "yashvoladoddi37/kanji-diffusion-v1-4"
# Load the base Stable Diffusion v1.4 pipeline in half precision
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16, use_safetensors=True)
# Load the fine-tuned LoRA attention weights into the UNet
pipe.unet.load_attn_procs(model_path)
pipe.to("cuda")

prompt = "A Kanji meaning baby robot"
image = pipe(prompt).images[0]
image.save("baby-robot-kanji-v1-4.png")
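
To try several concepts in one go, the prompt and file-name pattern used above can be wrapped in a few small helpers. This is only a minimal sketch: `kanji_prompt`, `output_name`, and `generate_all` are hypothetical names introduced here and are not part of diffusers. It assumes `pipe` behaves like the pipeline loaded above, so that `pipe(prompt).images[0]` returns an object with a `.save(path)` method.

```python
import re
from pathlib import Path

# Same phrasing as the example prompt above
PROMPT_TEMPLATE = "A Kanji meaning {concept}"


def kanji_prompt(concept: str) -> str:
    """Build a prompt in the 'A Kanji meaning ...' form."""
    return PROMPT_TEMPLATE.format(concept=concept.strip())


def output_name(concept: str, suffix: str = "kanji-v1-4") -> str:
    """Turn 'baby robot' into 'baby-robot-kanji-v1-4.png'."""
    slug = re.sub(r"[^a-z0-9]+", "-", concept.lower()).strip("-")
    return f"{slug}-{suffix}.png"


def generate_all(pipe, concepts, out_dir="."):
    """Run the pipeline once per concept and save each image.

    `pipe` is assumed to be callable like the StableDiffusionPipeline above.
    Returns the list of paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for concept in concepts:
        image = pipe(kanji_prompt(concept)).images[0]
        path = out_dir / output_name(concept)
        image.save(str(path))
        saved.append(path)
    return saved
```

With the pipeline from the previous cell, `generate_all(pipe, ["baby robot", "Elon Musk"])` would write `baby-robot-kanji-v1-4.png` and `elon-musk-kanji-v1-4.png` to the current directory.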

Limitations

Training

Training Data

Hardware: Nvidia GTX 1650 (4 GB VRAM) with 8 GB RAM, and a T4 GPU on Colab

Training Script:

      !accelerate launch train_text_to_image_lora.py \
        --pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" \
        --dataset_name="yashvoladoddi37/kanjienglish" \
        --image_column="image" \
        --caption_column="text" \
        --resolution=512 \
        --random_flip \
        --train_batch_size=1 \
        --num_train_epochs=1 \
        --checkpointing_steps=500 \
        --learning_rate=1e-04 \
        --lr_scheduler="constant" \
        --lr_warmup_steps=0 \
        --seed=42 \
        --output_dir="kanji-diffusion-v1-4" \
        --validation_prompt="A kanji meaning Elon Musk" \
        --push_to_hub
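
The flags above determine how long the run is and how many intermediate checkpoints it writes. As a rough sketch, assuming a single process and a gradient accumulation of 1 (the flag is not set in the command), the number of optimizer steps follows from the dataset size. `training_schedule` is a hypothetical helper, and the dataset size used below is a made-up number for illustration only, not the size of the actual dataset.

```python
import math


def training_schedule(num_examples, train_batch_size=1, num_train_epochs=1,
                      checkpointing_steps=500, gradient_accumulation_steps=1):
    """Estimate total optimizer steps and the steps at which checkpoints are saved."""
    batches_per_epoch = math.ceil(num_examples / train_batch_size)
    updates_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
    total_steps = updates_per_epoch * num_train_epochs
    # A checkpoint is assumed to be written every `checkpointing_steps` steps
    checkpoint_steps = list(range(checkpointing_steps, total_steps + 1, checkpointing_steps))
    return total_steps, checkpoint_steps


# Hypothetical dataset size, chosen only to illustrate the arithmetic
total, checkpoints = training_schedule(num_examples=6000)
print(total, len(checkpoints))
```

With a batch size of 1 and a single epoch, every training image contributes exactly one step, so the run length equals the dataset size.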
