readout-guidance / README.md
license: apache-2.0

This repository stores the pre-trained weights for Readout Guidance: Learning Control from Diffusion Features. The code implementation of our method can be found at https://github.com/g-luo/readout-guidance.

Readout Head Weights

The weights/ folder contains the pre-trained weights of the readout heads, named according to the following convention:

readout_<base-model>_<task-type>_<head-type>.pt
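As an illustration of the convention, the following minimal sketch builds and parses weight filenames. The helper names `weight_filename` and `parse_weight_filename` are hypothetical and are not part of the readout-guidance codebase; the `.pt` extension is taken from the files listed in this README.

```python
from typing import NamedTuple


class ReadoutWeight(NamedTuple):
    """Components of a readout head filename (hypothetical helper type)."""
    base_model: str  # e.g. "sdxl" or "sdv15"
    task_type: str   # e.g. "spatial" or "drag"
    head_type: str   # e.g. "pose", "depth", "edge", "correspondence", "appearance"


def weight_filename(base_model: str, task_type: str, head_type: str) -> str:
    """Build a filename following readout_<base-model>_<task-type>_<head-type>."""
    return f"readout_{base_model}_{task_type}_{head_type}.pt"


def parse_weight_filename(filename: str) -> ReadoutWeight:
    """Split a weight filename back into its three components."""
    if not filename.endswith(".pt"):
        raise ValueError(f"expected a .pt file, got {filename!r}")
    parts = filename[: -len(".pt")].split("_")
    if len(parts) != 4 or parts[0] != "readout":
        raise ValueError(f"filename does not follow the convention: {filename!r}")
    return ReadoutWeight(*parts[1:])
```

For example, `weight_filename("sdxl", "spatial", "pose")` yields `readout_sdxl_spatial_pose.pt`, one of the files listed below.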

Spatially Aligned Control

  • readout_sdxl_spatial_pose.pt
  • readout_sdv15_spatial_pose.pt
    • Readout head trained with OpenPose pose skeletons as supervision on PascalVOC images, filtered to only those containing people.
  • readout_sdxl_spatial_depth.pt
  • readout_sdv15_spatial_depth.pt
    • Readout head trained with MiDaS depth maps as supervision on PascalVOC images.
  • readout_sdxl_spatial_edge.pt
  • readout_sdv15_spatial_edge.pt
    • Readout head trained with HED edge detections as supervision on PascalVOC images.

Drag-Based Manipulation

  • readout_sdxl_drag_correspondence.pt
  • readout_sdv15_drag_correspondence.pt
    • Readout head trained with a contrastive loss on CoTracker point tracks across pairs of DAVIS video frames.
  • readout_sdxl_drag_appearance.pt
  • readout_sdv15_drag_appearance.pt
    • Readout head trained with a triplet loss on frames derived from DAVIS videos, with real frames as positives and SDEdit-ed frames as negatives.
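For readers unfamiliar with triplet losses, here is a minimal sketch of the standard margin-based formulation on plain feature vectors. It is not the training code for the appearance head: the Euclidean distance, the `margin` value, and the function names are assumptions made for illustration, and the actual implementation in the readout-guidance repository may differ.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # assumed value, chosen only for illustration
) -> float:
    """Standard triplet loss: pull the positive closer to the anchor than the negative.

    The loss is zero once the negative is at least `margin` farther from the
    anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

In the setting described above, the positive would correspond to features of a real frame and the negative to features of an SDEdit-ed frame.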