---
title: salad bowl (vampnet)
emoji: 🥗
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 4.43.0
python_version: 3.9.17
app_file: app.py
pinned: false
license: cc-by-nc-4.0
---
# VampNet
This repository contains recipes for training generative music models on top of the Descript Audio Codec.
# Setting up
**Requires Python 3.9**.
You'll need a Python 3.9 environment to run VampNet, due to a [known issue with madmom](https://github.com/hugofloresgarcia/vampnet/issues/15).
(for example, using conda):
```bash
conda create -n vampnet python=3.9
conda activate vampnet
```
install VampNet
```bash
git clone https://github.com/hugofloresgarcia/vampnet.git
pip install -e ./vampnet
```
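As a quick sanity check that the install worked (not part of the official instructions, just a suggestion), try importing the package:
```bash
python -c "import vampnet; import audiotools; print('vampnet imported ok')"
```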
# Usage
quick start!
```python
import random
import vampnet
import audiotools as at
# load the default vampnet model
interface = vampnet.interface.Interface.default()
# list available finetuned models
finetuned_model_choices = interface.available_models()
print(f"available finetuned models: {finetuned_model_choices}")
# pick a random finetuned model
model_choice = random.choice(finetuned_model_choices)
print(f"choosing model: {model_choice}")
# load a finetuned model
interface.load_finetuned(model_choice)
# load an example audio file
signal = at.AudioSignal("assets/example.wav")
# get the tokens for the audio
codes = interface.encode(signal)
# build a mask for the audio
mask = interface.build_mask(
    codes, signal,
    periodic_prompt=7,
    upper_codebook_mask=3,
)
# generate the output tokens
output_tokens = interface.vamp(
    codes, mask, return_mask=False,
    temperature=1.0,
    typical_filtering=True,
)
# convert them to a signal
output_signal = interface.decode(output_tokens)
# save the output signal
output_signal.write("scratch/output.wav")
```
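The mask controls how much of the input is preserved: roughly speaking, `periodic_prompt` keeps a periodic subset of the input tokens and `upper_codebook_mask` limits which codebook levels are kept, so the values above leave a fair amount of the original intact. As a hedged sketch (the values below are illustrative, not tuned), the same calls with a sparser prompt should wander further from the input:
```python
# a sparser mask: keep fewer tokens from the input (values are illustrative)
loose_mask = interface.build_mask(
    codes, signal,
    periodic_prompt=13,
    upper_codebook_mask=3,
)
variation = interface.vamp(
    codes, loose_mask, return_mask=False,
    temperature=1.0,
    typical_filtering=True,
)
interface.decode(variation).write("scratch/output_loose.wav")
```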
## Launching the Gradio Interface
You can launch a gradio UI to play with vampnet.
```bash
python app.py --args.load conf/interface.yml --Interface.device cuda
```
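If you don't have a GPU, the same launch should work on CPU (an untested sketch, assuming `Interface.device` accepts any torch device string; generation will be much slower):
```bash
python app.py --args.load conf/interface.yml --Interface.device cpu
```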
# Training / Fine-tuning
## Training a model
To train a model, run the following script:
```bash
python scripts/exp/train.py --args.load conf/vampnet.yml --save_path /path/to/checkpoints
```
for multi-GPU training, use torchrun:
```bash
torchrun --nproc_per_node gpu scripts/exp/train.py --args.load conf/vampnet.yml --save_path path/to/ckpt
```
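`--nproc_per_node gpu` launches one process per visible GPU; to use a specific number of GPUs instead, pass an explicit count (sketch):
```bash
torchrun --nproc_per_node 2 scripts/exp/train.py --args.load conf/vampnet.yml --save_path path/to/ckpt
```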
You can edit `conf/vampnet.yml` to change the dataset paths or any training hyperparameters.
For coarse2fine models, you can use `conf/c2f.yml` as a starting configuration.
See `python scripts/exp/train.py -h` for a list of options.
## Debugging training
To debug training, it's easiest to run with 1 GPU and 0 dataloader workers:
```bash
CUDA_VISIBLE_DEVICES=0 python -m pdb scripts/exp/train.py --args.load conf/vampnet.yml --save_path /path/to/checkpoints --num_workers 0
```
## Fine-tuning
To fine-tune a model, use the script in `scripts/exp/fine_tune.py` to generate 3 configuration files: `c2f.yml`, `coarse.yml`, and `interface.yml`.
The first two are used to fine-tune the coarse-to-fine (c2f) and coarse models, respectively. The last one is used to launch the gradio interface.
```bash
python scripts/exp/fine_tune.py "/path/to/audio1.mp3 /path/to/audio2/ /path/to/audio3.wav" <fine_tune_name>
```
This will create a folder under `conf/generated/<fine_tune_name>/` with the 3 configuration files.
The save_paths will be set to `runs/<fine_tune_name>/coarse` and `runs/<fine_tune_name>/c2f`.
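The resulting layout should look roughly like this (a sketch based on the paths used by the commands below):
```
conf/generated/<fine_tune_name>/
├── c2f.yml
├── coarse.yml
└── interface.yml

runs/<fine_tune_name>/
├── coarse/   # written by the coarse job
└── c2f/      # written by the c2f job
```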
launch the coarse job:
```bash
python scripts/exp/train.py --args.load conf/generated/<fine_tune_name>/coarse.yml
```
this will save the coarse model to `runs/<fine_tune_name>/coarse/ckpt/best/`.
launch the c2f job:
```bash
python scripts/exp/train.py --args.load conf/generated/<fine_tune_name>/c2f.yml
```
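Once both jobs have finished, the generated `interface.yml` can be passed to the gradio app in the same way as the default config (a sketch, assuming the same `app.py` flags as above):
```bash
python app.py --args.load conf/generated/<fine_tune_name>/interface.yml --Interface.device cuda
```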
## A note on argbind
This repository relies on [argbind](https://github.com/pseeth/argbind) to manage CLIs and config files.
Config files are stored in the `conf/` folder.
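In practice, argbind exposes bound function arguments as CLI flags (e.g. `--Interface.device`, `--save_path` in the commands above), and `--args.load` layers a YAML config underneath them, so you can mix a config file with one-off overrides. A sketch reusing only flags that already appear in this README:
```bash
# load a base config, then override individual bound arguments on the command line
python scripts/exp/train.py \
    --args.load conf/vampnet.yml \
    --save_path /path/to/checkpoints \
    --num_workers 0
```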
### Take a look at the pretrained models
All the pretrained models (trained by hugo) are stored here: https://huggingface.co/hugggof/vampnet
### Licensing for Pretrained Models:
The weights for the models are licensed [`CC BY-NC-SA 4.0`](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.ml). Likewise, any VampNet models fine-tuned on the pretrained models are also licensed [`CC BY-NC-SA 4.0`](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.ml).
Download the pretrained models from [this link](https://zenodo.org/record/8136629). Then, extract the models to the `models/` folder.
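For example, assuming the Zenodo download is a single archive named `models.zip` (the real filename may differ, so check the download page), extraction could look like:
```bash
mkdir -p models
unzip models.zip -d models/
```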