mrtuandao committed on
Commit c5d116c
1 Parent(s): 7f71d96

Upload folder using huggingface_hub
README.md CHANGED
@@ -22,15 +22,25 @@ extra_gated_fields:
   I have read the License and agree with its terms: checkbox
 ---
 
+# Re-upload
+
+This repository is being re-uploaded to HuggingFace in accordance with [The CreativeML OpenRAIL-M License](https://huggingface.co/spaces/CompVis/stable-diffusion-license) under which this repository was originally uploaded, specifically **Section II** which grants:
+
+> ...a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare, publicly display, publicly perform, sublicense, and distribute the Complementary Material, the Model, and Derivatives of the Model.
+
+Note that these files did not come from HuggingFace, but instead from [modelscope](https://www.modelscope.cn/models/AI-ModelScope/stable-diffusion-inpainting/files). As such, some files that were present in the original repository may not be present. File integrity has been verified via checksum.
+
+# Original Model Card
+
 Stable Diffusion Inpainting is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask.
 
 The **Stable-Diffusion-Inpainting** was initialized with the weights of the [Stable-Diffusion-v-1-2](https://huggingface.co/CompVis/stable-diffusion-v-1-2-original) checkpoint. First came 595k steps of regular training, then 440k steps of inpainting training at resolution 512x512 on “laion-aesthetics v2 5+”, with 10% dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598). For inpainting, the UNet has 5 additional input channels (4 for the encoded masked-image and 1 for the mask itself) whose weights were zero-initialized after restoring the non-inpainting checkpoint. During training, we generate synthetic masks and in 25% of cases mask everything.
 
-[![Open In Spaces](https://camo.githubusercontent.com/00380c35e60d6b04be65d3d94a58332be5cc93779f630bcdfc18ab9a3a7d3388/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f25463025394625413425393725323048756767696e67253230466163652d5370616365732d626c7565)](https://huggingface.co/spaces/runwayml/stable-diffusion-inpainting) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/in_painting_with_stable_diffusion_using_diffusers.ipynb)
-:-------------------------:|:-------------------------:|
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/in_painting_with_stable_diffusion_using_diffusers.ipynb)
+
 ## Examples:
 
-You can use this both with the [🧨Diffusers library](https://github.com/huggingface/diffusers) and the [RunwayML GitHub repository](https://github.com/runwayml/stable-diffusion).
+You can use this with the [🧨Diffusers library](https://github.com/huggingface/diffusers).
 
 ### Diffusers
 
@@ -38,8 +48,8 @@ You can use this both with the [🧨Diffusers library](https://github.com/huggin
 from diffusers import StableDiffusionInpaintPipeline
 
 pipe = StableDiffusionInpaintPipeline.from_pretrained(
-    "runwayml/stable-diffusion-inpainting",
-    revision="fp16",
+    "benjamin-paine/stable-diffusion-v1-5-inpainting",
+    variant="fp16",
     torch_dtype=torch.float16,
 )
 prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
@@ -59,18 +69,13 @@ image.save("./yellow_cat_on_park_bench.png")
 :-------------------------:|:-------------------------:|
 <span style="position: relative;bottom: 150px;">Face of a yellow cat, high resolution, sitting on a park bench</span> | <img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/test.png" alt="drawing" width="300"/>
 
-### Original GitHub Repository
-
-1. Download the weights [sd-v1-5-inpainting.ckpt](https://huggingface.co/runwayml/stable-diffusion-inpainting/resolve/main/sd-v1-5-inpainting.ckpt)
-2. Follow instructions [here](https://github.com/runwayml/stable-diffusion#inpainting-with-stable-diffusion).
-
 ## Model Details
 - **Developed by:** Robin Rombach, Patrick Esser
 - **Model type:** Diffusion-based text-to-image generation model
 - **Language(s):** English
 - **License:** [The CreativeML OpenRAIL M license](https://huggingface.co/spaces/CompVis/stable-diffusion-license) is an [Open RAIL M license](https://www.licenses.ai/blog/2022/8/18/naming-convention-of-responsible-ai-licenses), adapted from the work that [BigScience](https://bigscience.huggingface.co/) and [the RAIL Initiative](https://www.licenses.ai/) are jointly carrying in the area of responsible AI licensing. See also [the article about the BLOOM Open RAIL license](https://bigscience.huggingface.co/blog/the-bigscience-rail-license) on which our license is based.
 - **Model Description:** This is a model that can be used to generate and modify images based on text prompts. It is a [Latent Diffusion Model](https://arxiv.org/abs/2112.10752) that uses a fixed, pretrained text encoder ([CLIP ViT-L/14](https://arxiv.org/abs/2103.00020)) as suggested in the [Imagen paper](https://arxiv.org/abs/2205.11487).
-- **Resources for more information:** [GitHub Repository](https://github.com/runwayml/stable-diffusion), [Paper](https://arxiv.org/abs/2112.10752).
+- **Resources for more information:** [Paper](https://arxiv.org/abs/2112.10752).
 - **Cite as:**
 
       @InProceedings{Rombach_2022_CVPR,
@@ -99,10 +104,11 @@ Excluded uses are described below.
 ### Misuse, Malicious Use, and Out-of-Scope Use
 _Note: This section is taken from the [DALLE-MINI model card](https://huggingface.co/dalle-mini/dalle-mini), but applies in the same way to Stable Diffusion v1_.
 
-
 The model should not be used to intentionally create or disseminate images that create hostile or alienating environments for people. This includes generating images that people would foreseeably find disturbing, distressing, or offensive; or content that propagates historical or current stereotypes.
+
 #### Out-of-Scope Use
 The model was not trained to be factual or true representations of people or events, and therefore using the model to generate such content is out-of-scope for the abilities of this model.
+
 #### Misuse and Malicious Use
 Using the model to generate content that is cruel to individuals is a misuse of this model. This includes, but is not limited to:
 
@@ -140,7 +146,6 @@ Texts and images from communities and cultures that use other languages are like
 This affects the overall output of the model, as white and western cultures are often set as the default. Further, the
 ability of the model to generate content with non-English prompts is significantly worse than with English-language prompts.
 
-
 ## Training
 
 **Training Data**
@@ -209,7 +214,6 @@ Based on that information, we estimate the following CO2 emissions using the [Ma
 - **Compute Region:** US-east
 - **Carbon Emitted (Power consumption x Time x Carbon produced based on location of power grid):** 11250 kg CO2 eq.
 
-
 ## Citation
 
 ```bibtex
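The model card's description of the inpainting UNet (4 regular latent channels plus 5 extra input channels: 1 for the mask and 4 for the encoded masked image, giving 9 in total) can be sketched with plain NumPy. This is an illustrative sketch only; the concatenation order shown follows the convention used by the diffusers inpainting pipeline and is an assumption, not something stated in the card.

```python
import numpy as np

# Assemble the 9-channel UNet input described in the model card:
# 4 noised image latents + 1 downsampled mask + 4 masked-image latents.
latents = np.random.randn(1, 4, 64, 64).astype(np.float32)              # noised latents
mask = np.zeros((1, 1, 64, 64), dtype=np.float32)                        # 1 = repaint here
masked_image_latents = np.random.randn(1, 4, 64, 64).astype(np.float32)  # encoded masked image

unet_input = np.concatenate([latents, mask, masked_image_latents], axis=1)
print(unet_input.shape)  # (1, 9, 64, 64)
```

The extra 5 channels are exactly why the inpainting checkpoint zero-initializes those weights after restoring the non-inpainting checkpoint: with zeroed weights, the added channels initially contribute nothing, so the model starts from the behavior of the base checkpoint.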
safety_checker/model.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:26bedfc237ed91ea9661957a1201f1a9dec54da7fd92f37cb2e9bff4b73ab639
+size 1215981800
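The added files above are Git LFS pointer files: small text stubs recording the `version`, `oid` (a SHA-256 digest of the real file), and `size` in bytes, while the actual weights live in LFS storage. As a minimal sketch, a pointer can be parsed with a hypothetical helper like this (`parse_lfs_pointer` is not part of this repo):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is recorded in bytes
    return fields

# The safety_checker pointer added in this commit:
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:26bedfc237ed91ea9661957a1201f1a9dec54da7fd92f37cb2e9bff4b73ab639
size 1215981800"""

info = parse_lfs_pointer(pointer)
print(info["oid"], info["size"])
```

The `oid` field is what makes the checksum verification mentioned in the re-upload note possible: it commits the repository to an exact byte-for-byte digest of each large file.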
sd-v1-5-inpainting.fp16.ckpt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3384ef4638923d8cfe838d73d280dfbabc521ab751e8b74f8a0cba8a91d512fd
+size 4265456609
sd-v1-5-inpainting.fp16.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1a33284f5a9be288d1d97c4b1d66d186b1eda8d3703506318e3358bf05914cee
+size 2132692100
sd-v1-5-inpainting.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ef97ac1fe87ed0406433ad8710ff1da6e07e873de9a1a107b828844336d015ec
+size 4265216468
text_encoder/model.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7e7b5cc5d68991c50f042031c04547e49f006e468d521d064257a7f465c2910b
+size 492265848
unet/diffusion_pytorch_model.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3c5e5441a3304b5fe1eb1b29279889440f0ebdbf969a078b191c6c7046a2cc7f
+size 3438225120
vae/diffusion_pytorch_model.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:37af64f891b66af7e83ce2dc8490e2fcaff9b4e681a725fb7a1376d8826bdeb2
+size 334643252