metadata
license: mit
Prompt2MedImage - Diffusion for Medical Images
Prompt2MedImage is a latent text-to-image diffusion model that has been fine-tuned on medical images from the ROCO dataset.
The weights here are intended to be used with the 🧨Diffusers library.
This model was trained using Amazon SageMaker and the Hugging Face Deep Learning container.
Model Details
- Developed by: Nihir Chadderwala
- Model type: Diffusion-based text-to-medical-image generation model
- Language: English
- License: wtfpl
- Model Description: This latent text-to-image diffusion model can be used to generate high-quality medical images from text prompts. It uses a fixed, pretrained text encoder (CLIP ViT-L/14), as suggested in the Imagen paper.
License
This model is open access and available to all, with the Do What The F*ck You Want To Public License (WTFPL) further specifying rights and usage.
- You may not use the model to deliberately produce or share illegal or harmful outputs or content.
- The author claims no rights to the outputs you generate; you are free to use them and are accountable for their use.
- You may redistribute the weights and use the model commercially and/or as a service.
Run using PyTorch
pip install diffusers transformers
Running the pipeline with the default PNDM scheduler:
import torch
from diffusers import StableDiffusionPipeline
model_id = "Nihirc/Prompt2MedImage"
device = "cuda"
# Load the fine-tuned weights in half precision to reduce GPU memory use
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)
prompt = "Showing the subtrochanteric fracture in the porotic bone."
# The pipeline returns a list of PIL images; keep the first one
image = pipe(prompt).images[0]
image.save("porotic_bone_fracture.png")
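The snippet above assumes a CUDA GPU is available. As a minimal sketch of a more defensive variant, the code below falls back to CPU with full precision when no GPU is found, seeds the generator so a prompt can be reproduced, and optionally swaps the default PNDM scheduler for DPM-Solver. The helper names (choose_device_and_dtype, prompt_to_filename, generate) and the default seed and step count are hypothetical choices for illustration, not part of the model's published usage.

```python
import re


def choose_device_and_dtype(cuda_available):
    """Pick half precision on a GPU and full precision on CPU."""
    return ("cuda", "float16") if cuda_available else ("cpu", "float32")


def prompt_to_filename(prompt, max_len=60):
    """Turn a prompt into a filesystem-friendly PNG name (hypothetical helper)."""
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return slug[:max_len].rstrip("_") + ".png"


def generate(prompt, seed=0, steps=50, use_dpm_solver=False):
    """Generate one image for `prompt`; seed and steps are illustrative defaults."""
    # Heavy imports stay local so the helpers above can be imported
    # without torch or diffusers installed.
    import torch
    from diffusers import DPMSolverMultistepScheduler, StableDiffusionPipeline

    device, dtype_name = choose_device_and_dtype(torch.cuda.is_available())
    pipe = StableDiffusionPipeline.from_pretrained(
        "Nihirc/Prompt2MedImage", torch_dtype=getattr(torch, dtype_name)
    ).to(device)

    if use_dpm_solver:
        # Optional: replace the default scheduler, reusing its configuration.
        pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

    # A seeded generator makes repeated runs of the same prompt reproducible.
    generator = torch.Generator(device=device).manual_seed(seed)
    result = pipe(prompt, num_inference_steps=steps, generator=generator)
    return result.images[0]
```

With these helpers, generate(prompt).save(prompt_to_filename(prompt)) would write the image under a name derived from the prompt. Running on CPU is expected to be much slower than on a GPU.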
Citation
O. Pelka, S. Koitka, J. Rückert, F. Nensa, C.M. Friedrich,
"Radiology Objects in COntext (ROCO): A Multimodal Image Dataset".
MICCAI Workshop on Large-scale Annotation of Biomedical Data and Expert Label Synthesis (LABELS) 2018, September 16, 2018, Granada, Spain. Lecture Notes in Computer Science (LNCS), vol. 11043, pp. 180-189, Springer Cham, 2018.
doi: 10.1007/978-3-030-01364-6_20