---
title: Unboxing SDXL with SAEs
app_file: app.py
sdk: gradio
sdk_version: 4.44.1
---

# Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders

![modification demonstration](resourses/image.png)

This repository contains code to reproduce the results of our paper ([arXiv:2410.22366](https://arxiv.org/abs/2410.22366)) on using sparse autoencoders (SAEs) to analyze and interpret the internal representations of text-to-image diffusion models, specifically SDXL Turbo.
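
For readers new to SAEs: the idea is to re-express an intermediate activation as a sparse combination of learned feature directions. Below is a minimal sketch of a top-k SAE in the spirit of `openai/sparse_autoencoder`, which our implementation builds on; the class name and all sizes are illustrative, not the actual code in `SAE/`.

```python
# Minimal top-k SAE sketch -- illustrative only; see SAE/ for the real implementation.
import torch
import torch.nn as nn


class TopKSAE(nn.Module):  # hypothetical name, not the class defined in SAE/
    def __init__(self, d_model: int, n_features: int, k: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        self.decoder = nn.Linear(n_features, d_model)
        self.k = k

    def forward(self, x):
        # Encode, then keep only the k largest pre-activations per sample.
        acts = self.encoder(x)
        topk = torch.topk(acts, self.k, dim=-1)
        codes = torch.zeros_like(acts).scatter_(-1, topk.indices, topk.values.relu())
        return self.decoder(codes), codes  # reconstruction and sparse feature codes


sae = TopKSAE(d_model=1280, n_features=20480, k=32)  # all sizes illustrative
x = torch.randn(4, 1280)                             # stand-in for block activations
recon, codes = sae(x)
loss = nn.functional.mse_loss(recon, x)              # SAEs train on reconstruction error
```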

## Repository Structure

```
|-- SAE/                            # Core sparse autoencoder implementation
|-- SDLens/                         # Tools for analyzing diffusion models
|   `-- hooked_sd_pipeline.py       # Modified stable diffusion pipeline
|-- scripts/
|   |-- collect_latents_dataset.py  # Generate training data
|   `-- train_sae.py                # Train SAE models
|-- utils/
|   `-- hooks.py                    # Hook utility functions
|-- checkpoints/                    # Pretrained SAE model checkpoints
|-- app.py                          # Demo application
|-- app.ipynb                       # Interactive notebook demo
|-- example.ipynb                   # Usage examples
`-- requirements.txt                # Python dependencies
```
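
The instrumentation in `SDLens/hooked_sd_pipeline.py` and `utils/hooks.py` is presumably built on PyTorch's module-hook mechanism. Here is a generic sketch of that mechanism, illustrative only and not the repo's actual code:

```python
import torch
import torch.nn as nn

# Generic forward-hook sketch: capture a module's output during a forward pass.
captured = {}

def make_hook(name):
    def hook(module, inputs, output):
        captured[name] = output.detach()  # stash the block's activation
    return hook

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 8))
handle = model[0].register_forward_hook(make_hook("first_linear"))

model(torch.randn(2, 8))
print(captured["first_linear"].shape)  # torch.Size([2, 16])
handle.remove()  # detach the hook when done
```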

## Installation

```bash
pip install -r requirements.txt
```

## Demo Application

You can try our Gradio demo application (`app.ipynb`) to browse and experiment with the 20K+ features of our trained SAEs out of the box. The same notebook is available on [Google Colab](https://colab.research.google.com/drive/1Sd-g3w2Fwv7pc_fxgeQOR3S_RKr18qMP?usp=sharing). To run the demo locally instead, see the note below.
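
The README does not state how to launch the demo outside the notebook, but given the `app_file: app.py` and `sdk: gradio` settings in the front matter, running the app file directly should work once dependencies are installed (an assumption, not a documented command):

```bash
python app.py
```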

## Usage

1. Collect latent data from SDXL Turbo:

```bash
python scripts/collect_latents_dataset.py --save_path={your_save_path}
```

2. Train sparse autoencoders:

2.1. Set the path to the stored latents and the directory for storing checkpoints in `SAE/config.json` (a hypothetical sketch of this config follows below).

2.2. Run the training script:

```bash
python scripts/train_sae.py
```
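
For step 2.1, a hypothetical sketch of editing `SAE/config.json` from Python is shown below; the key names `latents_path` and `checkpoints_dir` are assumptions for illustration, so check the actual file for the real schema.

```python
import json

# Hypothetical sketch -- the key names below are assumptions, not the real schema.
config = {
    "latents_path": "/data/sdxl_latents",     # where collect_latents_dataset.py saved latents
    "checkpoints_dir": "checkpoints/my_run",  # where train_sae.py should write checkpoints
}

with open("SAE/config.json", "w") as f:
    json.dump(config, f, indent=2)
```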

## Pretrained Models

We provide pretrained SAE checkpoints for four key transformer blocks in SDXL Turbo's U-Net. See `example.ipynb` for analysis examples and visualizations of the learned features.
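
Before diving into `example.ipynb`, you can peek inside a checkpoint with plain `torch.load`. The path and the flat state-dict layout below are assumptions for illustration; the actual filenames live under `checkpoints/`.

```python
import torch

# Illustrative path; assumes each checkpoint is a flat state_dict of tensors.
state = torch.load("checkpoints/example_block/sae.pt", map_location="cpu")

for name, tensor in state.items():
    print(name, tuple(tensor.shape))  # encoder/decoder shapes reveal the SAE's dimensions
```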

## Citation

If you find this code useful in your research, please cite our paper:

```bibtex
@misc{surkov2024unpackingsdxlturbointerpreting,
  title={Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders},
  author={Viacheslav Surkov and Chris Wendler and Mikhail Terekhov and Justin Deschenaux and Robert West and Caglar Gulcehre},
  year={2024},
  eprint={2410.22366},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2410.22366},
}
```

## Acknowledgements

The SAE component was implemented based on the [`openai/sparse_autoencoder`](https://github.com/openai/sparse_autoencoder) repository.