---
license: mit
---
# Prompt2MedImage - Diffusion for Medical Images
Prompt2MedImage is a latent text-to-image diffusion model that has been fine-tuned on medical images from the ROCO dataset.
The weights here are intended to be used with the 🧨Diffusers library.
This model was trained using Amazon SageMaker and the Hugging Face Deep Learning container.
## Model Details
- **Developed by:** Nihir Chadderwala
- **Model type:** Diffusion-based text-to-medical-image generation model
- **Language:** English
- **License:** wtfpl
- **Model Description:** This latent text-to-image diffusion model can be used to generate high-quality medical images based on text prompts. It uses a fixed, pretrained text encoder ([CLIP ViT-L/14](https://arxiv.org/abs/2103.00020)) as suggested in the [Imagen paper](https://arxiv.org/abs/2205.11487).
## License
This model is open access and available to all, with the Do What the F*ck You Want To Public License (WTFPL) further specifying rights and usage.
- You can't use the model to deliberately produce or share illegal or harmful outputs or content.
- The author claims no rights on the outputs you generate; you are free to use them and are accountable for their use.
- You may re-distribute the weights and use the model commercially and/or as a service.
## Run using PyTorch
```bash
pip install diffusers transformers torch
```
Run the pipeline with the default PNDM scheduler:
```python
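import torch
from diffusers import StableDiffusionPipeline

# Model repository ID or local path to the weights
model_id = "Prompt2MedImage"
device = "cuda"  # float16 inference assumes a CUDA-capable GPU

# Load the fine-tuned weights in half precision and move the pipeline to the GPU
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)

# Generate one image from the text prompt and save it
prompt = "Showing the subtrochanteric fracture in the porotic bone."
image = pipe(prompt).images[0]
image.save("porotic_bone_fracture.png")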
```
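The snippet above hardcodes `device = "cuda"` and `torch.float16`, so it will fail on a machine without a CUDA GPU. The following is a minimal sketch of one way to fall back to CPU and to make generation repeatable with a fixed seed. The helper names `pick_device_and_dtype` and `generate`, and the `seed` parameter, are hypothetical and introduced here for illustration only; they are not part of Diffusers or of this model. The `model_id` default simply reuses the value from the example above.

```python
def pick_device_and_dtype(cuda_available: bool) -> tuple[str, str]:
    """Return (device, torch dtype name) for the given hardware (hypothetical helper)."""
    # Half precision is mainly useful on GPU; on CPU fall back to full precision.
    return ("cuda", "float16") if cuda_available else ("cpu", "float32")


def generate(prompt: str, model_id: str = "Prompt2MedImage", seed: int = 0):
    """Generate one image from a prompt (hypothetical helper).

    Heavy imports are deferred so that importing this module stays cheap.
    """
    import torch
    from diffusers import StableDiffusionPipeline

    device, dtype_name = pick_device_and_dtype(torch.cuda.is_available())
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=getattr(torch, dtype_name)
    ).to(device)

    # Seeding the generator fixes the initial noise, so the same prompt and seed
    # should give the same image on the same hardware and library versions.
    generator = torch.Generator(device=device).manual_seed(seed)
    return pipe(prompt, generator=generator).images[0]
```

Calling `generate("Showing the subtrochanteric fracture in the porotic bone.").save("porotic_bone_fracture.png")` would then mirror the example above. Expect CPU inference in float32 to be considerably slower than GPU inference.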
## Citation
```
O. Pelka, S. Koitka, J. Rückert, F. Nensa, C.M. Friedrich,
"Radiology Objects in COntext (ROCO): A Multimodal Image Dataset".
MICCAI Workshop on Large-scale Annotation of Biomedical Data and Expert Label Synthesis (LABELS) 2018, September 16, 2018, Granada, Spain. Lecture Notes in Computer Science (LNCS), vol. 11043, pp. 180-189, Springer Cham, 2018.
doi: 10.1007/978-3-030-01364-6_20
```