---
license: cc-by-nc-4.0
---
# RAVE Models
This is a collection of [RAVE](https://github.com/acids-ircam/RAVE) models trained by the [Intelligent Instruments Lab](https://iil.is) for various projects.
Most of these models are encoder-decoder only, with no prior. All of them use the `--causal` mode and are exported for streaming inference with [nn~](https://github.com/acids-ircam/nn_tilde), [NN.ar](https://github.com/elgiano/nn.ar) or [rave-supercollider](https://github.com/victor-shepardson/rave-supercollider).
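The filenames listed below appear to follow a common pattern: a descriptive name, then `_b` (block size), `_r` or `_sr` (sample rate in Hz) and `_z` (latent dimensions). The following Python sketch parses that pattern. The pattern itself is inferred from the filenames in this README rather than documented by the authors, and `ModelInfo` and `parse_model_filename` are hypothetical names used only for illustration.

```python
import re
from typing import NamedTuple


class ModelInfo(NamedTuple):
    name: str         # descriptive prefix, e.g. "guitar_iil"
    block_size: int   # number after "_b"
    sample_rate: int  # number after "_r" or "_sr", in Hz
    latent_dims: int  # number after "_z"


# Assumed naming pattern, inferred from the filenames in this README.
_PATTERN = re.compile(
    r"^(?P<name>.+)_b(?P<b>\d+)_s?r(?P<r>\d+)_z(?P<z>\d+)\.ts$"
)


def parse_model_filename(filename: str) -> ModelInfo:
    """Split a model filename into its name and numeric fields."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized model filename: {filename!r}")
    return ModelInfo(
        match["name"], int(match["b"]), int(match["r"]), int(match["z"])
    )


info = parse_model_filename("guitar_iil_b2048_r48000_z16.ts")
print(info.name, info.block_size, info.sample_rate, info.latent_dims)
```

The fields in the filename are only a convenience; the `Model:` line under each entry remains the reference for a model's actual configuration.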
## Musical Instruments
### guitar_iil_b2048_r48000_z16.ts
Dataset: [IILGuitarTimbre](https://github.com/Intelligent-Instruments-Lab/IILGuitarTimbre), a timbre-oriented collection of plucking, strumming, striking, scraping and more, recorded dry from an electric guitar.
Model: modified RAVE v1, 48kHz, block size 2048, 16 latent dimensions.
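For real-time use, the block size and sample rate given on each `Model:` line determine how much audio one processing block covers. The sketch below works through that arithmetic. It assumes that "block size" means the number of audio samples consumed or produced per latent frame, which this README does not state explicitly. The function names are hypothetical.

```python
def block_duration_ms(block_size: int, sample_rate: int) -> float:
    """Duration of one block of audio, in milliseconds."""
    return 1000.0 * block_size / sample_rate


def latent_frame_rate(block_size: int, sample_rate: int) -> float:
    """Latent frames per second, assuming one frame per block."""
    return sample_rate / block_size


# The two sample rates that appear in this collection.
for sample_rate in (48000, 44100):
    ms = block_duration_ms(2048, sample_rate)
    hz = latent_frame_rate(2048, sample_rate)
    print(f"{sample_rate} Hz: {ms:.2f} ms per block, {hz:.2f} frames/s")
```

Under that assumption, a block of 2048 samples lasts about 42.7 ms at 48 kHz and about 46.4 ms at 44.1 kHz. The latency actually heard also depends on the host's audio buffer settings.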
### sax_soprano_franziskaschroeder_b2048_r48000_z20.ts
Dataset: Soprano sax improvisation by [Franziska Schroeder](https://improvisationai.wordpress.com/).
Model: modified RAVE v1, 48kHz, block size 2048, 20 latent dimensions.
### organ_archive_b2048_r48000_z16.ts
Dataset: various recordings of organ music sourced from archive.org. Small amounts of voice and other instruments were included, and vinyl record noises are prominent.
Model: modified RAVE v1, 48kHz, block size 2048, 16 latent dimensions.
### organ_bach_b2048_sr48000_z16.ts
Dataset: various recordings of J.S. Bach music for church organ.
Model: modified RAVE v1, 48kHz, block size 2048, 16 latent dimensions.
## Voice
### voice_vocalset_b2048_r48000_z16.ts
Dataset: [VocalSet](https://zenodo.org/record/1193957) singing voice dataset.
Model: modified RAVE v1, 48kHz, block size 2048, 16 latent dimensions.
### voice_hifitts_b2048_r48000_z16.ts
Dataset: [Hi-Fi TTS](http://arxiv.org/abs/2104.01497) audiobooks dataset.
Model: modified RAVE v1, 48kHz, block size 2048, 16 latent dimensions.
### voice_jvs_b2048_r44100_z16.ts
Dataset: [Hi-Fi TTS](http://arxiv.org/abs/2104.01497) speaker 9017 (John Van Stan).
Model: RAVE v3, 44.1kHz, block size 2048, 16 latent dimensions.
### voice_vctk_b2048_r44100_z16.ts
Dataset: [CSTR VCTK Corpus](https://datashare.ed.ac.uk/handle/10283/3443) multispeaker read speech dataset.
Model: RAVE v3, 44.1kHz, block size 2048, 22 latent dimensions.
## Birds
### birds_motherbird_b2048_r48000_z16.ts
This model of bird sounds was curated by Manuel Cherep, Jessica Shand and Jack Armitage for their piece Motherbird, performed at TENOR 2023 in Boston, May 2023.
Dataset: bird sounds.
Model: RAVE v1, 48kHz, block size 2048, 16 latent dimensions.
### birds_pluma_b2048_r48000_z12.ts
This model of bird sounds was curated by Giacomo Lepri for his instrument *[Pluma](http://www.giacomolepri.com/pluma)*.
Dataset: bird sounds.
Model: modified RAVE v1, 48kHz, block size 2048, 12 latent dimensions.
## *Pond Brain* Marine Sounds
These models of marine sounds were trained for [Jenna Sutela](https://jennasutela.com/)'s *Pond Brain* installations at [Copenhagen Contemporary](https://copenhagencontemporary.org/en/yet-it-moves-read-online/) and the [Helsinki Biennial](https://helsinkibiennaali.fi/en/artist/jenna-sutela/).
Caution: these decoders sometimes produce a loud chirp on first initialization.
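One possible way to soften such a chirp is to fade the decoder output in over its first blocks. The Python sketch below applies a linear fade-in to a stream of samples processed block by block. This is an illustrative mitigation, not something the authors describe; the function name `fade_in` and the fade length are arbitrary assumptions.

```python
from typing import List


def fade_in(samples: List[float], fade_len: int, start: int = 0) -> List[float]:
    """Scale samples by a linear ramp from 0 to 1 over `fade_len` samples.

    `start` is the position of samples[0] within the whole output stream,
    so consecutive blocks can continue the same ramp.
    """
    if fade_len <= 0:
        raise ValueError("fade_len must be positive")
    return [
        value * min(1.0, (start + i) / fade_len)
        for i, value in enumerate(samples)
    ]


# Hypothetical usage: ramp up over the first two blocks of 2048 samples.
block_size = 2048
fade_len = 2 * block_size
first_block = fade_in([1.0] * block_size, fade_len, start=0)
second_block = fade_in([1.0] * block_size, fade_len, start=block_size)
third_block = fade_in([1.0] * block_size, fade_len, start=2 * block_size)
print(first_block[0], second_block[0], third_block[0])
```

A fade only helps if the chirp falls within the ramp. Keeping the host's output volume low while loading a model is a simpler precaution.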
### water_pondbrain_b2048_r48000_z16.ts
Dataset: water recordings from freesound.org.
<details>
<summary>list of freesound users</summary>
inspectorj, inchadney, aesqe, vonfleisch, javetakami, atomediadesign, kolezan, zabuhailo, zaziesound, repdac3, al_sub, lgarrett, uzbazur, lydmakeren, frenkfurth, edo333, boredtoinsanity, owl, kaydinhamby, tliedes, ilmari_freesound, manoslindos, l3ardoc, alexbuk, s-light
</details>
Model: modified RAVE v1, 48kHz, block size 2048, 16 latent dimensions.
### humpbacks_pondbrain_b2048_r48000_z20.ts
Dataset: humpback whale recordings from the [Watkins database](https://cis.whoi.edu/science/B/whalesounds/index.cfm), [MBARI](https://freesound.org/people/MBARI_MARS/), and BBC.
Model: modified RAVE v1, 48kHz, block size 2048, 20 latent dimensions.
### marinemammals_pondbrain_b2048_r48000_z20.ts
Dataset: various marine mammal sounds from [NOAA](https://www.fisheries.noaa.gov/national/science-data/sounds-ocean-mammals), the [Watkins database](https://cis.whoi.edu/science/B/whalesounds/index.cfm), freesound users `felixblume` and `geraldfiebig`, and sound effects databases.
Model: modified RAVE v1, 48kHz, block size 2048, 20 latent dimensions.