AutoTrain documentation

DreamBooth


DreamBooth is an innovative method that allows for the customization of text-to-image models like Stable Diffusion using just a few images of a subject. DreamBooth enables the generation of new, contextually varied images of the subject in a range of scenes, poses, and viewpoints, expanding the creative possibilities of generative models.

Data Preparation

The data format for DreamBooth training is simple. All you need is images of a concept (e.g. a person) and a concept token.

Step 1: Gather Your Images

Collect 3-5 high-quality images of the subject you wish to personalize. These images should vary slightly in pose or background to provide the model with a diverse learning set. You can select more images if you want to train a more robust model.
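
For example, a quick sanity check of the collected images can catch corrupt files or unexpectedly small resolutions before training. The snippet below is a minimal sketch, not part of AutoTrain itself; it assumes your photos live in a local ./images folder and that you will train at 512x512.

```python
from pathlib import Path
from PIL import Image

IMAGE_DIR = Path("./images")  # assumed location of your subject photos

image_files = sorted(
    p for p in IMAGE_DIR.iterdir()
    if p.suffix.lower() in {".jpg", ".jpeg", ".png", ".webp"}
)
print(f"Found {len(image_files)} candidate images")

for path in image_files:
    with Image.open(path) as img:
        img.verify()  # raises if the file is truncated or corrupt
    with Image.open(path) as img:  # reopen: verify() invalidates the handle
        width, height = img.size
        # Flag images smaller than the intended training resolution (512 here).
        if min(width, height) < 512:
            print(f"{path.name}: {width}x{height} -- consider a higher-resolution photo")
```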

Step 2: Select Your Model

Choose a base model from the Hugging Face Hub that is compatible with your needs. It is essential to select a model that supports the image size of your training data. Models available on the Hub often have specific requirements or capabilities, so ensure the model you choose can accommodate the dimensions of your images.
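
If you are unsure what resolution a base model expects, you can inspect it with the diffusers library. This is a rough sketch rather than an AutoTrain API: for Stable Diffusion style models, the UNet's latent sample size multiplied by the pipeline's VAE scale factor gives the native pixel resolution. The model id below is only an example.

```python
from diffusers import DiffusionPipeline

# Example model id; substitute the model you plan to fine-tune.
# Note: this downloads the full pipeline weights, which can be several GB.
pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base")

# Native training resolution = latent sample size * VAE downsampling factor.
native_resolution = pipe.unet.config.sample_size * pipe.vae_scale_factor
print(native_resolution)  # 512 for SD 1.x/2.x base models, 1024 for SDXL
```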

Step 3: Define Your Concept Token

The concept token is a crucial element in DreamBooth training. It acts as a unique identifier for your subject within the model. Typically, you pick a short, rare identifier and embed it in the prompt parameter of your training setup (see the Parameters section below). After training, you include the same token in your prompts to generate new images of your subject; the snippet below illustrates the idea.
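
As a concrete illustration, the sketch below uses the placeholder token sks, which is just a common example choice, not a requirement; any short identifier that is unlikely to collide with ordinary vocabulary works.

```python
# The concept token is an arbitrary rare identifier for your subject.
concept_token = "sks"

# Training-time prompt: this is the value passed as the `prompt` parameter.
training_prompt = f"a photo of {concept_token} person"

# Inference-time prompts: reuse the same token to place the subject in new scenes.
new_scene_prompts = [
    f"a photo of {concept_token} person riding a bicycle in Paris",
    f"a watercolor portrait of {concept_token} person",
]
```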

Parameters

class autotrain.trainers.dreambooth.params.DreamBoothTrainingParams


( model: str = None vae_model: typing.Optional[str] = None revision: typing.Optional[str] = None tokenizer: typing.Optional[str] = None image_path: str = None class_image_path: typing.Optional[str] = None prompt: str = None class_prompt: typing.Optional[str] = None num_class_images: int = 100 class_labels_conditioning: typing.Optional[str] = None prior_preservation: bool = False prior_loss_weight: float = 1.0 project_name: str = 'dreambooth-model' seed: int = 42 resolution: int = 512 center_crop: bool = False train_text_encoder: bool = False batch_size: int = 4 sample_batch_size: int = 4 epochs: int = 1 num_steps: int = None checkpointing_steps: int = 500 resume_from_checkpoint: typing.Optional[str] = None gradient_accumulation: int = 1 disable_gradient_checkpointing: bool = False lr: float = 0.0001 scale_lr: bool = False scheduler: str = 'constant' warmup_steps: int = 0 num_cycles: int = 1 lr_power: float = 1.0 dataloader_num_workers: int = 0 use_8bit_adam: bool = False adam_beta1: float = 0.9 adam_beta2: float = 0.999 adam_weight_decay: float = 0.01 adam_epsilon: float = 1e-08 max_grad_norm: float = 1.0 allow_tf32: bool = False prior_generation_precision: typing.Optional[str] = None local_rank: int = -1 xformers: bool = False pre_compute_text_embeddings: bool = False tokenizer_max_length: typing.Optional[int] = None text_encoder_use_attention_mask: bool = False rank: int = 4 xl: bool = False mixed_precision: typing.Optional[str] = None token: typing.Optional[str] = None push_to_hub: bool = False username: typing.Optional[str] = None validation_prompt: typing.Optional[str] = None num_validation_images: int = 4 validation_epochs: int = 50 checkpoints_total_limit: typing.Optional[int] = None validation_images: typing.Optional[str] = None logging: bool = False )

Parameters

  • model (str) — Name of the model to be used for training.
  • vae_model (Optional[str]) — Name of the VAE model to be used, if any.
  • revision (Optional[str]) — Specific model version to use.
  • tokenizer (Optional[str]) — Tokenizer to be used, if different from the model.
  • image_path (str) — Path to the training images.
  • class_image_path (Optional[str]) — Path to the class images.
  • prompt (str) — Prompt for the instance images.
  • class_prompt (Optional[str]) — Prompt for the class images.
  • num_class_images (int) — Number of class images to generate.
  • class_labels_conditioning (Optional[str]) — Conditioning labels for class images.
  • prior_preservation (bool) — Enable prior preservation during training.
  • prior_loss_weight (float) — Weight of the prior preservation loss.
  • project_name (str) — Name of the project for output directory.
  • seed (int) — Random seed for reproducibility.
  • resolution (int) — Resolution of the training images.
  • center_crop (bool) — Enable center cropping of images.
  • train_text_encoder (bool) — Enable training of the text encoder.
  • batch_size (int) — Batch size for training.
  • sample_batch_size (int) — Batch size for sampling.
  • epochs (int) — Number of training epochs.
  • num_steps (int) — Maximum number of training steps.
  • checkpointing_steps (int) — Steps interval for checkpointing.
  • resume_from_checkpoint (Optional[str]) — Path to resume training from a checkpoint.
  • gradient_accumulation (int) — Number of gradient accumulation steps.
  • disable_gradient_checkpointing (bool) — Disable gradient checkpointing.
  • lr (float) — Learning rate for training.
  • scale_lr (bool) — Enable scaling of the learning rate.
  • scheduler (str) — Type of learning rate scheduler.
  • warmup_steps (int) — Number of warmup steps for learning rate scheduler.
  • num_cycles (int) — Number of cycles for learning rate scheduler.
  • lr_power (float) — Power factor for learning rate scheduler.
  • dataloader_num_workers (int) — Number of workers for data loading.
  • use_8bit_adam (bool) — Enable use of 8-bit Adam optimizer.
  • adam_beta1 (float) — Beta1 parameter for Adam optimizer.
  • adam_beta2 (float) — Beta2 parameter for Adam optimizer.
  • adam_weight_decay (float) — Weight decay for Adam optimizer.
  • adam_epsilon (float) — Epsilon parameter for Adam optimizer.
  • max_grad_norm (float) — Maximum gradient norm for clipping.
  • allow_tf32 (bool) — Allow use of TF32 for training.
  • prior_generation_precision (Optional[str]) — Precision for prior generation.
  • local_rank (int) — Local rank for distributed training.
  • xformers (bool) — Enable xformers memory efficient attention.
  • pre_compute_text_embeddings (bool) — Pre-compute text embeddings before training.
  • tokenizer_max_length (Optional[int]) — Maximum length for tokenizer.
  • text_encoder_use_attention_mask (bool) — Use attention mask for text encoder.
  • rank (int) — Rank of the LoRA update matrices used for fine-tuning.
  • xl (bool) — Enable training for Stable Diffusion XL (SDXL) base models.
  • mixed_precision (Optional[str]) — Mixed precision mode to use (for example fp16 or bf16); leave unset to train in full precision.
  • token (Optional[str]) — Token for accessing the model hub.
  • push_to_hub (bool) — Enable pushing the model to the hub.
  • username (Optional[str]) — Username for the model hub.
  • validation_prompt (Optional[str]) — Prompt for validation images.
  • num_validation_images (int) — Number of validation images to generate.
  • validation_epochs (int) — Epoch interval for validation.
  • checkpoints_total_limit (Optional[int]) — Total limit for checkpoints.
  • validation_images (Optional[str]) — Path to validation images.
  • logging (bool) — Enable logging using TensorBoard.
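
For illustration, here is a minimal sketch of how these parameters fit together. The model id, image folder, concept token, and hyperparameter values are assumptions chosen for the example, not recommendations.

```python
from autotrain.trainers.dreambooth.params import DreamBoothTrainingParams

params = DreamBoothTrainingParams(
    model="stabilityai/stable-diffusion-2-1-base",  # example base model id
    image_path="./images",                          # folder with your subject photos
    prompt="a photo of sks person",                 # concept token embedded in the prompt
    project_name="my-dreambooth-model",
    resolution=512,
    batch_size=1,
    gradient_accumulation=4,
    num_steps=500,
    lr=1e-4,
    mixed_precision="fp16",
)
print(params)
```

The resulting object holds the full training configuration; the training run itself is launched through AutoTrain's usual entry points (for example the CLI or the web UI).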
