codeShare committed on
Commit
2b7496e
1 Parent(s): 551a5a7

Upload sd_token_similarity_calculator.ipynb

sd_token_similarity_calculator.ipynb CHANGED
@@ -123,7 +123,7 @@
123
  },
124
  "outputId": "e335f5da-b26d-4eea-f854-fd646444ea14"
125
  },
126
- "execution_count": 15,
127
  "outputs": [
128
  {
129
  "output_type": "stream",
@@ -279,6 +279,48 @@
279
  },
280
  "execution_count": null,
281
  "outputs": []
282
  }
283
  ]
284
  }
 
123
  },
124
  "outputId": "e335f5da-b26d-4eea-f854-fd646444ea14"
125
  },
126
+ "execution_count": null,
127
  "outputs": [
128
  {
129
  "output_type": "stream",
 
279
  },
280
  "execution_count": null,
281
  "outputs": []
282
+ },
283
+ {
284
+ "cell_type": "markdown",
285
+ "source": [
286
+ "\n",
287
+ "\n",
288
+ "This is how the notebook works:\n",
289
+ "\n",
290
+ "Similar vectors = similar output in the SD 1.5 / SDXL / FLUX models\n",
291
+ "\n",
292
+ "CLIP converts the prompt text to vectors (“tensors”), with float32 values usually ranging from -1 to 1.\n",
293
+ "\n",
294
+ "Dimensions are [ 1x768 ] tensors for SD 1.5, and [ 1x768 , 1x1024 ] tensors for SDXL and FLUX.\n",
295
+ "\n",
296
+ "The SD models and FLUX convert these vectors to an image.\n",
297
+ "\n",
298
+ "This notebook takes an input string, tokenizes it, and matches the first token against the 49407 token vectors in vocab.json: https://huggingface.co/black-forest-labs/FLUX.1-dev/tree/main/tokenizer\n",
299
+ "\n",
300
+ "It finds the “most similar tokens” in the list. Similarity is measured by the angle theta between the token vectors.\n",
301
+ "\n",
302
+ "\n",
303
+ "<div>\n",
304
+ "<img src=\"https://huggingface.co/datasets/codeShare/sd_tokens/resolve/main/cosine.jpeg\" width=\"300\"/>\n",
305
+ "</div>\n",
306
+ "\n",
307
+ "The angle is measured using cosine similarity, where 1 = 100% similarity (parallel vectors) and 0 = 0% similarity (perpendicular vectors).\n",
308
+ "\n",
309
+ "Negative similarity is also possible.\n",
310
+ "\n",
311
+ "So if you are bored of prompting “girl” and want something similar, you can run this notebook and use, for example, the “chick</w>” token at 21.88% similarity.\n",
312
+ "\n",
313
+ "You can also run a mixed search, like “cute+girl”/2, where for example “kpop</w>” has a 16.71% similarity.\n",
314
+ "\n",
315
+ "Sidenote: Prompt weights like (banana:1.2) will scale the magnitude of the corresponding 1x768 tensor(s) by 1.2.\n",
316
+ "\n",
317
+ "Source: https://huggingface.co/docs/diffusers/main/en/using-diffusers/weighted_prompts*\n",
318
+ "\n",
319
+ "So, TL;DR: vector direction = “what to generate”, vector magnitude = “prompt weights”"
320
+ ],
321
+ "metadata": {
322
+ "id": "njeJx_nSSA8H"
323
+ }
324
  }
325
  ]
326
  }
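
The markdown cell added in this commit describes ranking tokens by cosine similarity, a mixed “cute+girl”/2 search, and prompt weights that scale vector magnitude. The following is a minimal sketch of those three ideas in plain Python. It is not the notebook's code: the function names, the 3-dimensional vectors, and the “rock</w>” entry are all made up for illustration, whereas the notebook works on the 768-dimensional CLIP token vectors.

```python
import math

def cosine_similarity(a, b):
    """Return cos(theta) for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def average(a, b):
    """Element-wise mean of two vectors, i.e. the "(a+b)/2" mixed query."""
    return [(x + y) / 2 for x, y in zip(a, b)]

def scale(v, weight):
    """Multiply every component by a prompt weight such as 1.2."""
    return [weight * x for x in v]

def most_similar(query, vocab, top_k=3):
    """Rank vocab entries by cosine similarity to the query, best first."""
    scored = [(tok, cosine_similarity(query, vec)) for tok, vec in vocab.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Hypothetical toy vectors; real CLIP token vectors have 768 components.
toy_vocab = {
    "girl</w>": [1.0, 0.0, 0.0],
    "chick</w>": [0.6, 0.8, 0.0],
    "cute</w>": [0.0, 1.0, 0.0],
    "rock</w>": [-1.0, 0.0, 0.0],
}

# Mixed search: average two token vectors, then rank the vocab against it.
mixed_query = average(toy_vocab["cute</w>"], toy_vocab["girl</w>"])
ranking = most_similar(mixed_query, toy_vocab)

# Scaling a vector changes its magnitude but not its direction,
# so the cosine similarity is the same before and after weighting.
weighted = scale(toy_vocab["girl</w>"], 1.2)
unchanged = cosine_similarity(weighted, toy_vocab["chick</w>"])
```

Because cosine similarity divides by both magnitudes, a weight like 1.2 cancels out of the score. This matches the cell's summary that direction carries “what to generate” while magnitude carries the prompt weight.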