{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Tutorial : Running minimal inference examples with diffuser.\n", "\n", "For this tutorial, we will use my pre-trained lora embedding that is pretrained on pop-arts, illustrations and pixar footages.\n", "\n", "To get started install this package with:\n", "\n", "```bash\n", "pip install git+https://github.com/cloneofsimo/lora.git\n", "```\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9acf730c21a8494084225879c82b34cc", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Fetching 12 files: 0%| | 0/12 [00:00" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from diffusers import StableDiffusionPipeline\n", "import torch\n", "\n", "model_id = \"stabilityai/stable-diffusion-2-1-base\"\n", "\n", "pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to(\n", " \"cuda\"\n", ")\n", "\n", "prompt = \"style of sks, baby lion\"\n", "torch.manual_seed(1)\n", "image = pipe(prompt, num_inference_steps=50, guidance_scale=7).images[0]\n", "\n", "image # nice. diffusers are cool.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now there is two way to LORA this model. You can 1. monkey-patch it, or 2. update the weight inplaced.\n", "\n", "Monkey-patching is essentially replacing the linear layer with a lora-linear layer, which is the following\n", "\n", "$$\n", "x_2 = Wx_1 + A B^T x_1\n", "$$\n", "\n", "On the other hand, weight updating is literally replacing the original weight with the LORA weight. This is the following\n", "\n", "$$\n", "W' = W + A B^T\n", "$$\n", "\n", "You might find this weird. Just having the weight updated is the logical option. Why even monkey-patch when you can add the weights? Well, by keeping the LORA weights we can perform _weight mixing_ dynamically. We can't do this if we just update the weight, because the weight is fixed. This is the reason why we have two options. You can adjust the weight with `tune_lora_scale` function.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "84a231ed02dc4dbb90ae3aa79b891bd0", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/51 [00:00" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from lora_diffusion import monkeypatch_lora, tune_lora_scale\n", "\n", "\n", "monkeypatch_lora(pipe.unet, torch.load(\"../lora_disney.pt\"))\n", "tune_lora_scale(pipe.unet, 1.00)\n", "\n", "torch.manual_seed(1)\n", "image = pipe(prompt, num_inference_steps=50, guidance_scale=7).images[0]\n", "image.save(\"../contents/disney_lora.jpg\")\n", "image\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "85c793118ee745269c15cbc20460e8c1", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/51 [00:00" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "torch.manual_seed(1)\n", "tune_lora_scale(pipe.unet, 0.5)\n", "image = pipe(prompt, num_inference_steps=50, guidance_scale=7).images[0]\n", "image\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Nice. Let's try another example:\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2301a312e06a406bb95443dc26902763", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Fetching 12 files: 0%| | 0/12 [00:00" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to(\n", " \"cuda\"\n", ")\n", "\n", "prompt = \"style of sks, superman\"\n", "torch.manual_seed(1)\n", "image = pipe(prompt, num_inference_steps=50, guidance_scale=7).images[0]\n", "\n", "image\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "fa3b9d9ce39a4d59bfd808dc96150d28", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/51 [00:00" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "monkeypatch_lora(pipe.unet, torch.load(\"../lora_pop.pt\"))\n", "torch.manual_seed(1)\n", "tune_lora_scale(pipe.unet, 1.00)\n", "image = pipe(prompt, num_inference_steps=50, guidance_scale=7).images[0]\n", "image\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ouch, that must have been overfitting. Let's try to lower the alpha to have less effect : This is the result of alpha = 0.7.\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a98135b981404414aa695a776e0ea0a2", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/51 [00:00" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "torch.manual_seed(1)\n", "tune_lora_scale(pipe.unet, 0.7)\n", "\n", "image = pipe(prompt, num_inference_steps=50, guidance_scale=7).images[0]\n", "image.save(\"../contents/pop_art.jpg\")\n", "image\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3.8.12 ('pytorch_latest')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" }, "orig_nbformat": 4, "vscode": { "interpreter": { "hash": "952e1bebe1b278d85469a034aefc1854b777c1b518feedf8249123f6f86cec05" } } }, "nbformat": 4, "nbformat_minor": 2 }